16 docs tagged with "rag"

Advanced RAG Patterns

Go beyond naive RAG - master query transformation, HyDE, multi-query retrieval, Self-RAG, Corrective RAG, and iterative retrieval patterns for complex questions.

Agentic RAG

Build agents that control their own retrieval - multi-step reasoning, router agents, ReAct loops, LangGraph stateful pipelines, and production patterns for agentic retrieval systems.

Document Chunking Strategies

Master the art and science of splitting documents into chunks that maximize retrieval precision - one of the most underestimated decisions in RAG system design.

Embedding Models Deep Dive

Master embedding model selection for retrieval - MTEB benchmarks, model families, Matryoshka embeddings, bi-encoders vs cross-encoders, and fine-tuning strategies.

Graph RAG

Master Microsoft's Graph RAG - build knowledge graphs from documents, use community detection for global queries, and understand when graph structure beats flat vector search.

Hybrid Search: Dense and Sparse

Combine BM25 sparse retrieval with dense vector search for best-of-both-worlds performance - understand SPLADE, fusion methods, and when hybrid beats pure dense.

Module 04: RAG Systems

Master Retrieval-Augmented Generation - the dominant pattern for grounding LLMs in external knowledge at production scale.

Python for Vector Search

Embeddings, vector databases, similarity search, RAG pipelines, and production vector search in Python with FAISS, Chroma, Pinecone, and pgvector.

RAG Evaluation

Build rigorous RAG evaluation with RAGAS, TruLens, LLM-as-judge, golden datasets, and production monitoring - measure faithfulness, relevance, and groundedness.

Reranking

Master the two-stage retrieval-reranking architecture - cross-encoders, ColBERT, LLM-as-reranker, Reciprocal Rank Fusion, and production latency budgets.

Retrieval Algorithms and ANN

Master the approximate nearest neighbor algorithms powering vector search - HNSW, IVF, IVF-PQ, ScaNN, and DiskANN with parameter tuning and recall-latency trade-offs.

Securing RAG Systems

Attack surfaces unique to RAG architectures - document poisoning, retrieval hijacking, indirect prompt injection, embedding collision, cross-tenant leakage, and defense-in-depth strategies for production RAG deployments.

Vector Databases

Compare Pinecone, Qdrant, Weaviate, Milvus, Chroma, and pgvector - understand the engineering trade-offs and build a production vector store.

Why RAG and When Not To

Understand why LLMs hallucinate, what RAG actually solves, and the decision framework for choosing RAG vs fine-tuning vs prompt stuffing.