
Module 09: Agent Safety

Why Safety Is an Engineering Problem

In 2024, a Canadian tribunal held Air Canada liable after its chatbot described a bereavement refund policy that did not exist. The agent did exactly what it was designed to do - answer questions helpfully - and that was enough to create real legal and financial harm.

Agent safety is not a philosophical concern about future superintelligence. It is an immediate engineering discipline. Every agent you deploy today can take real actions in the world: send emails, call APIs, execute code, modify databases, charge credit cards, delete files. Each action carries risk. Safety engineering is how you keep that risk under control.

This module covers the seven topics you must understand to build agents you would be comfortable deploying in production.


Lesson Guide

| Lesson | Topic | Key Skills |
|--------|-------|------------|
| 01 | Risk Taxonomy | Identify and score agent risks before they happen |
| 02 | Minimal Footprint | Design agents with least privilege and reversibility |
| 03 | Prompt Injection | Detect and block injection attacks in agent pipelines |
| 04 | Guardrails | Build composable validation pipelines for agent actions |
| 05 | Human Oversight | Design meaningful oversight without creating bottlenecks |
| 06 | Sandboxing | Isolate agent execution environments at multiple levels |
| 07 | Responsible AI | Navigate regulation, accountability, and ethics in practice |

The Safety Mindset

Safe agent engineering requires a shift in how you think about failure. Traditional software fails by crashing or returning errors. Agents fail by doing the wrong thing convincingly. The agent that confidently books the wrong flight, deletes the wrong file, or sends the wrong email is more dangerous than one that errors out.

The tools in this module address this failure mode directly: limit what the agent can do, validate every action before execution, detect when something is going wrong, and always preserve the ability for humans to intervene.
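
As a rough illustration of how those four ideas can fit together, here is a minimal Python sketch of a gate that every proposed action passes through before it runs. All names in it (Action, ActionGate, ActionBlocked, recipient_is_internal, the example.com domain) are hypothetical and chosen for this example; they do not come from any particular agent framework, and the audit log is only a stand-in for real monitoring.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    """A hypothetical description of something the agent wants to do."""
    tool: str                # e.g. "send_email"
    args: dict
    reversible: bool = True  # irreversible actions need human approval


class ActionBlocked(Exception):
    """Raised when the gate refuses an action."""


def deny_all(action: Action) -> bool:
    # Default approver: with no human in the loop, irreversible actions are denied.
    return False


@dataclass
class ActionGate:
    allowed_tools: set[str]
    validators: list[Callable[[Action], Optional[str]]] = field(default_factory=list)
    approve: Callable[[Action], bool] = deny_all
    audit_log: list[tuple[str, str]] = field(default_factory=list)

    def check(self, action: Action) -> None:
        # 1. Limit what the agent can do: allowlist of tools.
        if action.tool not in self.allowed_tools:
            self._block(action, "tool not in allowlist")
        # 2. Validate before execution: each validator returns a problem or None.
        for validate in self.validators:
            problem = validate(action)
            if problem is not None:
                self._block(action, problem)
        # 3. Preserve human intervention: irreversible actions need approval.
        if not action.reversible and not self.approve(action):
            self._block(action, "human approval denied")
        # 4. Record the decision so unusual behaviour can be detected later.
        self.audit_log.append((action.tool, "allowed"))

    def _block(self, action: Action, reason: str) -> None:
        self.audit_log.append((action.tool, f"blocked: {reason}"))
        raise ActionBlocked(reason)


def recipient_is_internal(action: Action) -> Optional[str]:
    # Example validator (assumed policy): emails may only go to example.com.
    if action.tool == "send_email" and not action.args.get("to", "").endswith("@example.com"):
        return "external recipient"
    return None


gate = ActionGate(
    allowed_tools={"send_email", "delete_file"},
    validators=[recipient_is_internal],
)
gate.check(Action("send_email", {"to": "ops@example.com"}))  # passes silently
```

The agent runtime would call gate.check(action) and execute the tool only if no ActionBlocked is raised. Denying by default when no approver is configured is a deliberate choice in this sketch: a missing oversight hook fails closed rather than open.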

By the end of this module, you will have the vocabulary, frameworks, and code patterns to build agents that are both capable and safe.

© 2026 EngineersOfAI. All rights reserved.