LLM Evaluation Basics

This course, led by Laurie Voss (Head of Developer Relations at Arize AI and co-founder of npm), explains why evals matter in AI systems: they act as a testing mechanism for outputs that are inherently variable. It covers two types of evals: code evals for deterministic checks and LLM-as-a-Judge evals for more nuanced assessments. It then walks you through setting up tracing, writing code evals, and configuring LLM judges, emphasizing the need for clear criteria and structured prompts. You'll start small, with one code eval and one LLM eval, to identify failure patterns and improve your outputs. Finally, you'll explore the Arize-Phoenix documentation and evaluation tutorials to apply these strategies in your own projects.
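To make the two eval styles concrete, here is a minimal sketch of what each might look like. It is not taken from the course material: the use of plain Python, the `openai` client, the "gpt-4o-mini" model name, and the function names are all illustrative assumptions; the lessons use Arize-Phoenix's own tracing and evaluation helpers.

```python
# Illustrative sketch (not from the course) of the two eval types it describes:
# a code eval (deterministic check) and an LLM-as-a-Judge eval (nuanced check).
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def code_eval_is_valid_json(output: str) -> bool:
    """Code eval: a deterministic pass/fail check on a model output."""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False


# A structured judge prompt with clear criteria and a constrained output format.
JUDGE_PROMPT = """You are grading an answer for relevance.
Question: {question}
Answer: {answer}
Respond with exactly one word: "relevant" or "irrelevant"."""


def llm_judge_relevance(question: str, answer: str) -> str:
    """LLM-as-a-Judge eval: a second model scores a quality that code can't check."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(question=question, answer=answer),
        }],
        temperature=0,  # keep the judge as repeatable as possible
    )
    return response.choices[0].message.content.strip().lower()
```

Running one check of each kind over a small batch of traced outputs, as the course suggests, is usually enough to surface recurring failure patterns before you invest in a larger eval suite.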

Course content

2 sections | 2 lessons