Module 6 - Evaluating Open Models

Open LLM leaderboards, safety evaluation, hallucination testing, code generation benchmarks, and building custom eval harnesses.