8 docs tagged with "structured-generation"

Constrained Decoding - How It Works

The mathematics of constrained decoding - finite-state machines, token masking, context-free grammars, and how the Outlines library achieves guaranteed JSON schema conformance at generation time.

Instructor - Structured Outputs with Pydantic

A complete guide to Jason Liu's Instructor library - Pydantic-based structured extraction, automatic retry on validation failure, multi-provider support, streaming, and production extraction patterns.

JSON Mode and Tool/Function Schemas

A complete guide to native JSON mode, OpenAI Structured Outputs, tool calling for structured data, Anthropic tool use, parallel tool calls, and schema design best practices.

LMQL and Guidance - Programmatic LLM Control

How Microsoft Guidance and LMQL extend structured generation to full programmatic control - interleaving generation with code, SQL-like constraints, token healing, and when each tool wins over Outlines and Instructor.

Module 13: Structured Generation

A complete map of structured generation - from the reliability problem with free-text LLM output to constrained decoding, Outlines, Instructor, JSON mode, and production-grade extraction pipelines.

Outlines - Grammar-Constrained Generation

A complete guide to the Outlines library - compiling Pydantic schemas to FSMs, regex constraints, JSON schema constraints, vLLM integration, and production deployment patterns with guaranteed output conformance.

Structured Generation in Production

Production-grade architecture for structured generation pipelines - reliability stacks, schema versioning, monitoring, async batching, caching, edge case handling, and complete reference implementations.

Why Structured Output Matters in Production

The taxonomy of LLM output failures, why prompt-based JSON extraction breaks at scale, the production impact of a 5% failure rate, and the spectrum of solutions from prompt engineering to constrained decoding.