251 docs tagged with "agents"

Agent Evaluation

Measuring LLM agent performance through trajectory analysis, benchmark suites, LLM-as-judge, failure taxonomies, and production monitoring strategies.

Agent Safety and Guardrails

Implementing defense-in-depth safety for production LLM agents: prompt injection defense, input/output guardrails, tool sandboxing, human-in-the-loop (HITL) confirmation, and audit logging.

Do LLMs Benefit From Their Own Words?

Multi-turn interactions with large language models typically retain the assistant's own past responses in the conversation history. In this work, we rev...

LangChain Deep Dive

A thorough guide to LangChain's core abstractions, LCEL composable pipelines, LangGraph stateful workflows, LangSmith observability, and when to use LangChain vs direct API calls.

LlamaIndex Deep Dive

A comprehensive guide to LlamaIndex's data-centric architecture - indices, query engines, workflows, multi-document agents, and how it compares to LangChain for RAG applications.

Model Agreement via Anchoring

Numerous lines of work aim to control $\textit{model disagreement}$ -- the extent to which two machine learning models disagree in their predictions. We adop...

Multi-Agent Architectures

Building systems where multiple specialized LLM agents collaborate through orchestrator-worker, pipeline, and peer-to-peer patterns using LangGraph and CrewAI.

Planning and Reasoning

How LLM agents handle complex multi-step tasks through plan-and-execute, hierarchical planning, self-reflection, and LangGraph-based workflows.

ReAct Agent Pattern

Building LLM agents that interleave reasoning traces and actions in a ReAct loop to solve multi-step tasks with tool grounding.

Semantic Invariance in Agentic AI

Large Language Models (LLMs) increasingly serve as autonomous reasoning agents in decision support, scientific problem-solving, and multi-agent coordina...

Tool Use and Function Calling

Enabling LLMs to invoke external tools and APIs through structured function calling, covering JSON schema design, Anthropic vs OpenAI formats, parallel tool calls, and production safety.

Tool Use from Python

Building LLM tool use systems in Python -- function calling, tool schemas, execution loops, error handling, and multi-step agent patterns.