Module 6 - AI Security

AI systems introduce a new class of security vulnerabilities that don't map cleanly onto traditional software security. A SQL injection attack exploits predictable, deterministic behavior. Prompt injection exploits the fact that LLMs cannot reliably distinguish between instructions and data. That fundamental ambiguity is what makes AI security uniquely difficult - and why every engineer building production AI systems needs to understand it deeply.

This module covers the full threat landscape: how attackers exploit AI systems, how defenders detect and mitigate attacks, and how to build AI systems that are secure by design. Each lesson pairs theoretical understanding with production-grade code.

Threat Landscape

Lessons in This Module

#	Lesson	What You Will Learn
01	Prompt Injection	Direct vs indirect injection, instruction hierarchy attacks, real-world exploits (Bing Chat, ChatGPT plugins), detection and defense layers
02	Jailbreaks and Bypasses	DAN prompts, role-play exploits, token smuggling, many-shot jailbreaking, red-team taxonomy, alignment challenges
03	Data Poisoning	Backdoor attacks, trigger patterns, clean-label attacks, Hugging Face supply chain risks, RAG index poisoning
04	Model Extraction	Query-based extraction, distillation attacks, API rate limiting, watermarking, legal and IP considerations
05	Membership Inference	Shadow model attacks, likelihood-ratio tests, LLM memorization (Carlini et al.), differential privacy mitigations
06	Adversarial Examples	GCG and AutoDAN token attacks, embedding space attacks, adversarial suffixes, robustness evaluation
07	Red Teaming AI Systems	Manual vs automated red teaming, PyRIT, Garak, red team composition, operationalizing in SDLC
08	Securing RAG Systems	Prompt injection via documents, index poisoning, vector DB access control, PII leakage, multi-tenant security
09	AI Security Governance	NIST AI RMF, EU AI Act, model cards, responsible disclosure, AI bug bounties, SBOM for AI

Why AI Security Is Different

Traditional application security has decades of established patterns: sanitize inputs, parameterize queries, enforce least privilege, patch known CVEs. AI security breaks every one of these assumptions:

No clear input/instruction boundary - LLMs process instructions and data in the same token stream
Non-deterministic behavior - the same attack may succeed 30% of the time, making testing harder
Opaque internals - you cannot inspect what a model "knows" or "intends" at inference time
Training data as attack surface - the model itself can be compromised before deployment
Emergent capabilities - models do things they were not explicitly trained to do, including unsafe things

This module treats AI security as a first-class engineering discipline, not an afterthought.

Threat Landscape​

Lessons in This Module​

Why AI Security Is Different​

Threat Landscape

Lessons in This Module

Why AI Security Is Different