
Interruption and Human-in-the-Loop

The $47,000 Mistake

A team deployed an agent to automate their cloud infrastructure management. The agent had broad permissions: it could provision resources, modify configurations, and - crucially - delete unused resources to save costs.

The agent identified 47 EC2 instances as "unused" based on CPU utilization metrics. All 47 had CPU utilization under 5% for 30 days. The agent deleted them.

What the agent did not know: those instances were part of a disaster recovery cluster. Low CPU utilization was correct behavior - they were on standby. Recovery took three engineers two days. The deleted data required restoring from backups. Total cost: approximately $47,000 in engineering time.

The agent was technically doing exactly what it was built to do. The disaster was an architecture failure: no human judgment at the decision point where it mattered.

Human-in-the-loop (HITL) is not about distrust of agents. It is about knowing which decisions require human judgment and routing those decisions appropriately.



Interruption Modes

Different tasks call for different interruption strategies:

Checkpoint-Based Interruption

Pause at predefined milestones, regardless of agent confidence: for example, after planning but before executing, or after building but before deploying.

Best for: workflows with natural phase boundaries where human review has clear value at each phase.

Trigger-Based Interruption

Pause when a specific condition is met: confidence below threshold, action is irreversible, cost exceeds budget, or an exception occurs.

Best for: generally autonomous agents that should mostly run uninterrupted but need oversight for edge cases.

Time-Based Interruption

Check in every N minutes or every N steps, regardless of what happened.

Best for: long-running processes where periodic oversight is required, or where accumulated small decisions might drift from intent.

Exception-Based Interruption

Pause only when something goes wrong. Otherwise, run fully autonomous.

Best for: well-tested, high-confidence workflows where human review would add friction without value.
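The four modes can be read as small predicates over the agent's run state. The following is a minimal sketch, separate from the full implementation later in this section; `Mode`, `RunState`, `should_pause`, and every field name and default threshold are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    CHECKPOINT = "checkpoint"
    TRIGGER = "trigger"
    TIME = "time"
    EXCEPTION = "exception"


@dataclass
class RunState:
    """Hypothetical snapshot of the agent's state before its next step."""
    step: int = 0
    at_phase_boundary: bool = False   # e.g. planning has just finished
    confidence: float = 1.0
    action_irreversible: bool = False
    cost_usd: float = 0.0
    error: Optional[str] = None


def should_pause(mode: Mode, state: RunState,
                 every_n_steps: int = 10,
                 min_confidence: float = 0.7,
                 budget_usd: float = 1.0) -> bool:
    """Return True if the given interruption mode calls for a pause."""
    if mode is Mode.CHECKPOINT:
        # Pause at milestones, regardless of confidence.
        return state.at_phase_boundary
    if mode is Mode.TRIGGER:
        # Pause when any configured condition is met.
        return (state.confidence < min_confidence
                or state.action_irreversible
                or state.cost_usd > budget_usd
                or state.error is not None)
    if mode is Mode.TIME:
        # Pause every N steps, regardless of what happened.
        return state.step > 0 and state.step % every_n_steps == 0
    # Mode.EXCEPTION: pause only when something went wrong.
    return state.error is not None
```

In practice the modes are often combined, for example by pausing when any of several `should_pause` calls returns `True`.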


Action Classification: Safe vs. Dangerous

The foundation of good HITL design is classifying every action the agent can take:

Classification dimensions:

  • Reversibility: can this be undone? (read = fully reversible; delete = irreversible)
  • Blast radius: how many things break if this is wrong? (edit one line vs. drop a table)
  • External effects: does this affect systems outside our control? (email sent = cannot unsend)
  • Financial impact: does this cost money or affect billing?
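One way to turn these four dimensions into a single risk label is a small rule function. The sketch below is an assumption, not the article's method: `ActionProfile`, `risk_level`, and the specific rule (external or financial effects are always dangerous; irreversible actions with a wide blast radius are dangerous; anything else that changes state is risky) are hypothetical. The labels match the three risk levels used in the implementation later in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionProfile:
    reversible: bool
    blast_radius: str        # "none", "local", "service", "organization"
    external_effects: bool
    financial_impact: bool


def risk_level(p: ActionProfile) -> str:
    """Map the four dimensions to "safe", "risky", or "dangerous" (assumed rule)."""
    wide = p.blast_radius in ("service", "organization")
    if p.external_effects or p.financial_impact or (wide and not p.reversible):
        return "dangerous"
    if not p.reversible or p.blast_radius != "none":
        return "risky"
    return "safe"


# Reading a file: reversible, touches nothing.
print(risk_level(ActionProfile(True, "none", False, False)))
# Deleting a cloud resource: irreversible, service-wide.
print(risk_level(ActionProfile(False, "service", False, False)))
```

A rule like this is only a starting point; an explicit per-action table is easier to audit when the set of actions is small.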

The Approval Protocol

When the agent needs human approval, the information presented to the human matters:

What to show:

  1. What the agent is about to do (in plain English, not just code)
  2. Why it wants to do it (the goal it is pursuing)
  3. What happens if approved vs. denied
  4. Alternative approaches it considered
  5. Time sensitivity (how long can this wait?)

What NOT to do:

  • Show raw tool call arguments without explanation
  • Ask yes/no without offering alternatives
  • Interrupt for trivially low-stakes decisions
  • Make the approval UI so painful that humans approve everything without reading
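The five items in the "what to show" list map naturally onto a small data structure. This is a minimal sketch under assumed names: `ApprovalSummary`, its fields, and `render` are hypothetical, and the example values are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalSummary:
    action: str                  # 1. what the agent is about to do, in plain English
    reason: str                  # 2. why it wants to do it
    if_approved: str             # 3. what happens if approved...
    if_denied: str               #    ...and if denied
    alternatives: list[str] = field(default_factory=list)  # 4. alternatives considered
    can_wait_minutes: int = 60   # 5. time sensitivity

    def render(self) -> str:
        """Render the summary as plain text for a reviewer."""
        lines = [
            f"Action: {self.action}",
            f"Why: {self.reason}",
            f"If approved: {self.if_approved}",
            f"If denied: {self.if_denied}",
        ]
        if self.alternatives:
            lines.append("Alternatives considered:")
            lines.extend(f"  - {alt}" for alt in self.alternatives)
        lines.append(f"Can wait: {self.can_wait_minutes} minutes")
        return "\n".join(lines)


summary = ApprovalSummary(
    action="Delete 47 EC2 instances flagged as unused",
    reason="CPU utilization under 5% for 30 days",
    if_approved="Instances and their local data are removed",
    if_denied="Instances keep running; no cost savings",
    alternatives=["Stop the instances instead of deleting them"],
    can_wait_minutes=120,
)
print(summary.render())
```

Listing an alternative alongside the request gives the reviewer a third option beyond yes or no.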

Full Implementation: HITL System with Async Approval

"""
human_in_the_loop.py
Production human-in-the-loop system with:
- Action risk classification
- Async approval workflow
- Slack notification + response
- Configurable interruption policies
- Resume-after-approval state management

Requirements:
pip install pydantic aiohttp
"""

from __future__ import annotations

import asyncio
import json
import logging
import time
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Optional

from pydantic import BaseModel, Field

logger = logging.getLogger(__name__)


# ─── Action Risk Classification ───────────────────────────────────────────────

class RiskLevel(str, Enum):
    SAFE = "safe"            # Auto-execute, no approval needed
    RISKY = "risky"          # Log + execute, undo available
    DANGEROUS = "dangerous"  # Require explicit human approval


class ActionClass(BaseModel):
    """Metadata about a classified action."""
    action_name: str
    description: str
    risk_level: RiskLevel
    is_reversible: bool
    blast_radius: str  # "none", "local", "service", "organization"
    has_external_effects: bool
    financial_impact: bool = False
    justification: str  # why this risk level


# Static risk classification table
ACTION_RISK_TABLE: dict[str, ActionClass] = {
    "read_file": ActionClass(
        action_name="read_file",
        description="Read a file from disk",
        risk_level=RiskLevel.SAFE,
        is_reversible=True,
        blast_radius="none",
        has_external_effects=False,
        justification="Read-only operation with no side effects",
    ),
    "search_web": ActionClass(
        action_name="search_web",
        description="Search the web",
        risk_level=RiskLevel.SAFE,
        is_reversible=True,
        blast_radius="none",
        has_external_effects=False,
        justification="Read-only external call",
    ),
    "write_file": ActionClass(
        action_name="write_file",
        description="Write or overwrite a file",
        risk_level=RiskLevel.RISKY,
        is_reversible=True,
        blast_radius="local",
        has_external_effects=False,
        justification="Modifies local state; can be undone with version control",
    ),
    "run_command": ActionClass(
        action_name="run_command",
        description="Execute a shell command",
        risk_level=RiskLevel.RISKY,
        is_reversible=False,
        blast_radius="local",
        has_external_effects=False,
        justification="Commands may have side effects; depends on the specific command",
    ),
    "send_email": ActionClass(
        action_name="send_email",
        description="Send an email",
        risk_level=RiskLevel.DANGEROUS,
        is_reversible=False,
        blast_radius="organization",
        has_external_effects=True,
        justification="External communication cannot be unsent",
    ),
    "delete_resource": ActionClass(
        action_name="delete_resource",
        description="Delete a resource (file, record, cloud resource)",
        risk_level=RiskLevel.DANGEROUS,
        is_reversible=False,
        blast_radius="service",
        has_external_effects=False,
        justification="Data deletion may be permanent",
    ),
    "deploy_to_production": ActionClass(
        action_name="deploy_to_production",
        description="Deploy code to production environment",
        risk_level=RiskLevel.DANGEROUS,
        is_reversible=True,
        blast_radius="organization",
        has_external_effects=True,
        justification="Affects all users; rollback is possible but costly",
    ),
    "charge_payment": ActionClass(
        action_name="charge_payment",
        description="Process a financial transaction",
        risk_level=RiskLevel.DANGEROUS,
        is_reversible=False,
        blast_radius="organization",
        has_external_effects=True,
        financial_impact=True,
        justification="Financial transactions are irreversible once processed",
    ),
}


def classify_action(action_name: str, arguments: dict[str, Any]) -> ActionClass:
    """
    Classify an action by name. Falls back to dynamic risk assessment
    for unknown actions using heuristics.
    """
    if action_name in ACTION_RISK_TABLE:
        return ACTION_RISK_TABLE[action_name]

    # Heuristic classification for unknown actions
    name_lower = action_name.lower()
    if any(w in name_lower for w in ("delete", "drop", "remove", "destroy", "purge")):
        return ActionClass(
            action_name=action_name,
            description=f"Unknown action: {action_name}",
            risk_level=RiskLevel.DANGEROUS,
            is_reversible=False,
            blast_radius="service",
            has_external_effects=False,
            justification="Deletion keyword detected - treating as dangerous",
        )
    elif any(w in name_lower for w in ("send", "post", "publish", "notify", "email", "sms")):
        return ActionClass(
            action_name=action_name,
            description=f"Unknown action: {action_name}",
            risk_level=RiskLevel.DANGEROUS,
            is_reversible=False,
            blast_radius="organization",
            has_external_effects=True,
            justification="Communication keyword detected - treating as dangerous",
        )
    elif any(w in name_lower for w in ("read", "get", "fetch", "list", "search", "query")):
        return ActionClass(
            action_name=action_name,
            description=f"Unknown action: {action_name}",
            risk_level=RiskLevel.SAFE,
            is_reversible=True,
            blast_radius="none",
            has_external_effects=False,
            justification="Read keyword detected - treating as safe",
        )
    else:
        # Unknown: treat as risky by default
        return ActionClass(
            action_name=action_name,
            description=f"Unknown action: {action_name}",
            risk_level=RiskLevel.RISKY,
            is_reversible=True,
            blast_radius="local",
            has_external_effects=False,
            justification="Unknown action - defaulting to risky",
        )


# ─── Approval Request ─────────────────────────────────────────────────────────

class ApprovalStatus(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"
    MODIFIED = "modified"  # User approved with modifications
    TIMEOUT = "timeout"


class ApprovalRequest(BaseModel):
    """A request for human approval of a dangerous action."""
    id: str = Field(default_factory=lambda: str(uuid.uuid4())[:12])
    action_name: str
    action_class: ActionClass
    arguments: dict[str, Any]
    agent_reasoning: str  # Why the agent wants to do this
    goal_context: str  # What larger goal this supports
    alternatives_considered: list[str] = Field(default_factory=list)
    status: ApprovalStatus = ApprovalStatus.PENDING
    decision: Optional[str] = None  # Human's notes on their decision
    modified_arguments: Optional[dict] = None  # If human modifies the action
    created_at: float = Field(default_factory=time.time)
    decided_at: Optional[float] = None
    timeout_seconds: float = 3600.0  # Default 1 hour timeout

    @property
    def is_expired(self) -> bool:
        return time.time() > self.created_at + self.timeout_seconds

    def format_for_human(self) -> str:
        """Format the approval request for human review."""
        risk_emoji = {"safe": "🟢", "risky": "🟡", "dangerous": "🔴"}[self.action_class.risk_level]
        lines = [
            f"{risk_emoji} ACTION APPROVAL REQUIRED",
            f"Request ID: {self.id}",
            "",
            f"Action: {self.action_name}",
            f"Risk level: {self.action_class.risk_level.upper()}",
            f"Reversible: {'Yes' if self.action_class.is_reversible else 'NO - PERMANENT'}",
            "",
            "Arguments:",
        ]
        for k, v in self.arguments.items():
            lines.append(f"  {k}: {v}")

        lines.extend([
            "",
            f"Why: {self.agent_reasoning}",
            f"Goal: {self.goal_context}",
        ])

        if self.alternatives_considered:
            lines.append("\nAlternatives considered:")
            for alt in self.alternatives_considered:
                lines.append(f"  • {alt}")

        lines.extend([
            "",
            f"⏱ Timeout in: {int(self.timeout_seconds / 60)} minutes",
            "",
            "Options: APPROVE / DENY / MODIFY",
        ])
        return "\n".join(lines)


# ─── Approval Backends ────────────────────────────────────────────────────────

class CLIApprovalBackend:
    """Simple CLI-based approval. Synchronous - blocks until user responds."""

    async def request_approval(self, request: ApprovalRequest) -> ApprovalStatus:
        print("\n" + "=" * 60)
        print(request.format_for_human())
        print("=" * 60)

        while True:
            raw = input("\nYour decision [approve/deny/modify]: ").strip().lower()
            if raw in ("approve", "a", "yes", "y"):
                request.status = ApprovalStatus.APPROVED
                request.decided_at = time.time()
                print("✅ Approved")
                return ApprovalStatus.APPROVED
            elif raw in ("deny", "d", "no", "n"):
                request.status = ApprovalStatus.DENIED
                request.decided_at = time.time()
                print("❌ Denied")
                return ApprovalStatus.DENIED
            elif raw in ("modify", "m"):
                print("Enter modifications as JSON (e.g., {\"path\": \"/tmp/test.txt\"}):")
                try:
                    mods = json.loads(input())
                    request.modified_arguments = {**request.arguments, **mods}
                    request.status = ApprovalStatus.MODIFIED
                    request.decided_at = time.time()
                    print("✏️ Modified and approved")
                    return ApprovalStatus.MODIFIED
                except json.JSONDecodeError:
                    print("Invalid JSON - try again")
            else:
                print("Please enter 'approve', 'deny', or 'modify'")


class SlackApprovalBackend:
    """
    Slack-based async approval.
    Agent sends a message to Slack, waits for a response, resumes.
    """

    def __init__(self, webhook_url: str, channel: str = "#agent-approvals"):
        self.webhook_url = webhook_url
        self.channel = channel
        self._pending: dict[str, ApprovalRequest] = {}

    async def request_approval(self, request: ApprovalRequest) -> ApprovalStatus:
        """
        Send approval request to Slack.
        Poll for response until approved, denied, or timeout.
        """
        self._pending[request.id] = request
        await self._send_slack_message(request)

        # Poll for response (in production, use Slack event webhooks instead).
        # The status only changes when handle_slack_action() is called.
        start = time.time()
        while time.time() - start < request.timeout_seconds:
            if request.status != ApprovalStatus.PENDING:
                return request.status
            await asyncio.sleep(5)

        request.status = ApprovalStatus.TIMEOUT
        return ApprovalStatus.TIMEOUT

    async def _send_slack_message(self, request: ApprovalRequest) -> None:
        """Send formatted Slack message with approve/deny buttons."""
        import aiohttp  # type: ignore

        message = {
            "channel": self.channel,
            "text": f"Agent approval required: {request.action_name}",
            "blocks": [
                {
                    "type": "section",
                    "text": {
                        "type": "mrkdwn",
                        "text": f"*Agent Action Approval Required*\n\n"
                                f"Action: `{request.action_name}`\n"
                                f"Risk: *{request.action_class.risk_level.upper()}*\n"
                                f"Reversible: {'Yes' if request.action_class.is_reversible else ':warning: NO'}\n\n"
                                f"*Why:* {request.agent_reasoning}\n"
                                f"*Goal:* {request.goal_context}",
                    },
                },
                {
                    "type": "actions",
                    "elements": [
                        {
                            "type": "button",
                            "text": {"type": "plain_text", "text": "Approve"},
                            "style": "primary",
                            "value": f"approve:{request.id}",
                        },
                        {
                            "type": "button",
                            "text": {"type": "plain_text", "text": "Deny"},
                            "style": "danger",
                            "value": f"deny:{request.id}",
                        },
                    ],
                },
            ],
        }

        try:
            async with aiohttp.ClientSession() as session:
                await session.post(self.webhook_url, json=message)
        except Exception as e:
            logger.warning(f"Failed to send Slack notification: {e}")

    def handle_slack_action(self, action_value: str) -> None:
        """Called when a user clicks Approve/Deny in Slack."""
        # Button values look like "approve:<request_id>"; partition() tolerates
        # a missing ":" instead of raising IndexError.
        action, _, request_id = action_value.partition(":")
        request = self._pending.get(request_id)
        if request is None:
            return
        if action == "approve":
            request.status = ApprovalStatus.APPROVED
        elif action == "deny":
            request.status = ApprovalStatus.DENIED
        else:
            return  # Unknown action: leave the request pending
        request.decided_at = time.time()


# ─── Interrupt Policy ─────────────────────────────────────────────────────────

@dataclass
class InterruptPolicy:
    """
    Configures when and how the agent interrupts for human approval.
    """
    # Action-based interruption
    require_approval_for: list[RiskLevel] = field(
        default_factory=lambda: [RiskLevel.DANGEROUS]
    )

    # Confidence-based interruption
    confidence_threshold: float = 0.7  # Ask if agent confidence < this

    # Cost-based interruption
    max_cost_before_approval: float = 1.0  # USD - pause if about to exceed this

    # Time-based interruption
    check_in_every_n_steps: Optional[int] = None  # None = no time-based checks

    # Exception-based interruption (always on)
    pause_on_exception: bool = True

    # Autonomy level
    fully_autonomous: bool = False  # If True, only pause on DANGEROUS actions


# ─── HITL Agent Wrapper ────────────────────────────────────────────────────────

class HITLAgent:
    """
    Wraps any agent executor with human-in-the-loop interruption logic.
    Intercepts tool calls, classifies them, and requests approval when needed.
    """

    def __init__(
        self,
        policy: InterruptPolicy,
        approval_backend,
        agent_goal: str = "",
    ):
        self.policy = policy
        self.backend = approval_backend
        self.agent_goal = agent_goal
        self._step_count = 0
        # NOTE: nothing in this example updates _total_cost; the caller must
        # accumulate spend here for the cost-based check to ever fire.
        self._total_cost = 0.0
        self._action_log: list[dict] = []

    async def execute_action(
        self,
        action_name: str,
        arguments: dict[str, Any],
        actual_executor: Callable,
        agent_reasoning: str = "",
        confidence: float = 1.0,
    ) -> Any:
        """
        Execute an action with HITL interruption logic.
        Returns the action result (or raises if denied).
        """
        self._step_count += 1
        action_class = classify_action(action_name, arguments)

        # Log every action
        self._action_log.append({
            "step": self._step_count,
            "action": action_name,
            "risk": action_class.risk_level,
            "arguments": arguments,
            "timestamp": time.time(),
        })

        # Decide if we need approval
        needs_approval = self._should_interrupt(action_class, confidence)

        if needs_approval:
            request = ApprovalRequest(
                action_name=action_name,
                action_class=action_class,
                arguments=arguments,
                agent_reasoning=agent_reasoning or f"Executing {action_name} as part of task",
                goal_context=self.agent_goal,
                timeout_seconds=3600.0,
            )

            print(f"\n⏸️ Pausing for approval: {action_name} (risk: {action_class.risk_level.upper()})")
            status = await self.backend.request_approval(request)

            if status == ApprovalStatus.DENIED:
                raise PermissionError(f"Action '{action_name}' denied by human reviewer")
            elif status == ApprovalStatus.TIMEOUT:
                # Never auto-approve on timeout: surface it to the caller instead
                raise TimeoutError(f"Approval for '{action_name}' timed out after {request.timeout_seconds}s")
            elif status == ApprovalStatus.MODIFIED and request.modified_arguments:
                arguments = request.modified_arguments
                print(f"  Action modified by reviewer, proceeding with: {arguments}")
            else:
                print("  ✅ Action approved by human reviewer")

        # Check for periodic check-in (only on steps that did not need approval)
        elif (
            self.policy.check_in_every_n_steps
            and self._step_count % self.policy.check_in_every_n_steps == 0
        ):
            await self._periodic_check_in()

        # Execute the action
        result = actual_executor(action_name=action_name, **arguments)
        print(f"  ✅ Executed: {action_name}")
        return result

    def _should_interrupt(self, action_class: ActionClass, confidence: float) -> bool:
        """Determine if this action requires human approval."""
        if self.policy.fully_autonomous:
            # Only interrupt for DANGEROUS actions in fully autonomous mode
            return action_class.risk_level == RiskLevel.DANGEROUS

        if action_class.risk_level in self.policy.require_approval_for:
            return True

        if confidence < self.policy.confidence_threshold:
            return True

        if self._total_cost > self.policy.max_cost_before_approval:
            return True

        return False

    async def _periodic_check_in(self) -> None:
        """Periodic check-in with human - show progress, allow pause or redirect."""
        print(f"\n📊 Periodic check-in at step {self._step_count}")
        print(f"  Actions taken: {self._step_count}")
        last_actions = self._action_log[-5:]
        print("  Recent actions:")
        for action in last_actions:
            print(f"    [{action['risk'].upper()}] {action['action']}")

        response = input("\n  Continue? [yes/pause/stop]: ").strip().lower()
        if response in ("stop", "s"):
            raise InterruptedError("Agent stopped by human at periodic check-in")
        elif response in ("pause", "p"):
            print("  Agent paused. Call resume() to continue.")
            # In production: save state, exit loop


# ─── Demo ─────────────────────────────────────────────────────────────────────

async def demo_hitl():
    """
    Demonstrates HITL in action - agent asks for approval before dangerous actions.
    """
    policy = InterruptPolicy(
        require_approval_for=[RiskLevel.DANGEROUS],
        confidence_threshold=0.6,
        check_in_every_n_steps=5,
    )

    backend = CLIApprovalBackend()
    agent = HITLAgent(
        policy=policy,
        approval_backend=backend,  # keyword must match HITLAgent.__init__
        agent_goal="Set up automated deployment pipeline for the todo app",
    )

    # Simulate a sequence of actions the agent wants to take
    planned_actions = [
        {
            "name": "read_file",
            "args": {"path": "src/main.py"},
            "reasoning": "Need to understand current application structure",
            "confidence": 0.95,
        },
        {
            "name": "write_file",
            "args": {"path": ".github/workflows/deploy.yml", "content": "...CI config..."},
            "reasoning": "Creating GitHub Actions workflow for automated deployment",
            "confidence": 0.90,
        },
        {
            "name": "deploy_to_production",
            "args": {"service": "todo-api", "version": "v2.1.0", "region": "us-east-1"},
            "reasoning": "Deploying the new version after successful CI tests",
            "confidence": 0.88,
        },
        {
            "name": "send_email",
            "args": {"to": "[email protected]", "subject": "Deployment complete", "body": "v2.1.0 deployed"},
            "reasoning": "Notifying the team that the deployment succeeded",
            "confidence": 0.85,
        },
    ]

    def mock_executor(action_name: str, **kwargs) -> str:
        return f"Executed {action_name} successfully"

    print("\n=== HUMAN-IN-THE-LOOP DEMO ===")
    print("Agent will ask for approval before dangerous actions.\n")

    for action_config in planned_actions:
        try:
            result = await agent.execute_action(
                action_name=action_config["name"],
                arguments=action_config["args"],
                actual_executor=mock_executor,
                agent_reasoning=action_config["reasoning"],
                confidence=action_config["confidence"],
            )
            print(f"  Result: {result}")
        except PermissionError as e:
            print(f"  ⛔ {e} - skipping this action")
        except Exception as e:
            print(f"  ❌ Error: {e}")


if __name__ == "__main__":
    asyncio.run(demo_hitl())

Interruption Decision Flowchart

[Figure: interruption decision flowchart]

Resuming After Interruption

When an agent resumes after a human pause, it needs to re-establish context:

  1. Load the checkpoint: task graph, completed steps, current state
  2. Summarize the interruption: what was being done, what decision was made
  3. Continue execution: from the step that was paused

The agent should NOT restart planning from scratch. If the human approved the action, execute it and continue. If the human denied it, mark that action as skipped and let the planner determine how to proceed without it.

For long pauses (hours to days), add a context refresh step: re-read any files that might have changed, re-verify external state, before continuing.
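The resume steps can be sketched with a JSON checkpoint and a file-modification check for the context refresh. This is a minimal illustration under stated assumptions: `Checkpoint`, `save_checkpoint`, `load_checkpoint`, `stale_files`, and `next_steps` are hypothetical names, the plan is a flat list of step names rather than a task graph, and a denied step is simply dropped where a real planner would replan around it.

```python
import json
import os
import tempfile
import time
from dataclasses import asdict, dataclass, field


@dataclass
class Checkpoint:
    completed_steps: list[str]
    paused_step: str
    decision: str  # "approved" or "denied"
    watched_files: dict[str, float] = field(default_factory=dict)  # path -> mtime
    saved_at: float = field(default_factory=time.time)


def save_checkpoint(cp: Checkpoint, path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(cp), f)


def load_checkpoint(path: str) -> Checkpoint:
    with open(path, encoding="utf-8") as f:
        return Checkpoint(**json.load(f))


def stale_files(cp: Checkpoint) -> list[str]:
    """Watched files that changed or disappeared since the checkpoint."""
    return [
        p for p, mtime in cp.watched_files.items()
        if not os.path.exists(p) or os.path.getmtime(p) != mtime
    ]


def next_steps(cp: Checkpoint, plan: list[str]) -> list[str]:
    """Steps still to run: everything not completed, minus a denied step."""
    remaining = [s for s in plan if s not in cp.completed_steps]
    if cp.decision == "denied":
        remaining = [s for s in remaining if s != cp.paused_step]
    return remaining


if __name__ == "__main__":
    plan = ["build", "deploy", "notify"]
    cp = Checkpoint(completed_steps=["build"], paused_step="deploy", decision="approved")
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "checkpoint.json")
        save_checkpoint(cp, path)
        resumed = load_checkpoint(path)
    print(next_steps(resumed, plan))   # resumes at the paused step, no replanning
    print(stale_files(resumed))        # files to re-read before continuing
```

Comparing modification times is a cheap refresh signal for local files; external state such as cloud resources would need its own re-verification call.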


Production Notes

:::warning Approval Fatigue
If the agent asks for approval too often, humans start approving everything without reading. This is called "approval fatigue" and makes the HITL system worse than useless. Keep DANGEROUS actions genuinely rare by designing agents to prefer reversible alternatives. The goal is meaningful oversight, not rubber-stamping.
:::

:::danger Never Default to Approve on Timeout
When an approval times out, the safe default is to pause the run, not to auto-approve. An auto-approve-on-timeout policy is especially dangerous for financial operations. Build in explicit timeout handling that escalates (try another reviewer) rather than proceeding.
:::

Calibration: track the rate at which humans approve vs. deny vs. modify actions. A high approval rate (>95%) might indicate over-interruption (calibrate the risk threshold up). A high modify rate might indicate the agent needs better action design. A high deny rate might indicate the agent misunderstands the goal.
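A calibration check along these lines needs only a log of reviewer decisions. The sketch below assumes decisions are recorded as the strings `"approved"`, `"denied"`, and `"modified"`; `calibration_report` and its return shape are hypothetical.

```python
from collections import Counter


def calibration_report(decisions: list[str], high_approval: float = 0.95) -> dict:
    """Summarize reviewer decisions and flag likely over-interruption."""
    total = len(decisions)
    if total == 0:
        return {"total": 0, "rates": {}, "over_interrupting": False}
    counts = Counter(decisions)
    rates = {k: counts[k] / total for k in ("approved", "denied", "modified")}
    return {
        "total": total,
        "rates": rates,
        # An approval rate above the threshold suggests the agent asks too often.
        "over_interrupting": rates["approved"] > high_approval,
    }


log = ["approved"] * 97 + ["denied"] * 2 + ["modified"]
print(calibration_report(log))
```

Rates computed over a small sample are noisy, so it is reasonable to require a minimum number of decisions before acting on the flag.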


Interview Questions and Answers

Q: When should agents have human-in-the-loop, and when should they operate autonomously?

A: The deciding factors are reversibility, blast radius, and external effects. Fully autonomous is appropriate when: all actions are reversible (can be undone if wrong), the blast radius is local (only affects the current task, not other users or systems), and there are no external communications. HITL is needed when: actions are irreversible (data deletion, emails, payments), the blast radius is large (affects production, many users), or external parties are involved. Most production agents should be HITL for dangerous actions and autonomous for everything else - the goal is to interrupt humans only when their judgment adds genuine value.

Q: How do you prevent approval fatigue?

A: Three strategies. First, genuinely minimize dangerous actions in the agent design - prefer reversible alternatives (stage to dev before prod, archive before delete, draft emails for review). Second, calibrate thresholds carefully - track approval rates and adjust the risk classification if humans are approving >95% of requests. Third, improve action specificity - instead of "deploy to production?" present "deploy version 2.1.0 to us-east-1, affecting 12,000 users, with 30-minute rollback available" so humans can make informed decisions quickly. Approval fatigue happens when reviewers cannot quickly assess what they are approving.

Q: How do you implement async approval (agent waits without blocking)?

A: Use an asyncio event or a database-backed polling loop. The agent sends an approval request (to Slack, email, or a web UI) with a unique request ID, then saves its state and parks. A separate webhook endpoint receives the human's response, updates the approval record in the database, and (optionally) triggers the parked agent to resume. In practice with asyncio, I use asyncio.Event: the approval backend sets the event when a response arrives, and the agent awaits the event. With multiple workers, use a message queue (Redis pub/sub, SQS) to notify the correct worker.
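The `asyncio.Event` variant of that answer can be sketched as follows. `PendingApprovals`, `wait_for_decision`, and `resolve` are hypothetical names, and the registry lives in memory, so this only works within a single process; a multi-worker deployment would need the database or message queue mentioned above.

```python
import asyncio


class PendingApprovals:
    """In-memory registry of approvals awaiting a decision (single process only)."""

    def __init__(self) -> None:
        self._events: dict[str, asyncio.Event] = {}
        self._decisions: dict[str, str] = {}

    async def wait_for_decision(self, request_id: str, timeout_s: float) -> str:
        """Park until resolve() is called for this request, or time out."""
        event = self._events.setdefault(request_id, asyncio.Event())
        try:
            await asyncio.wait_for(event.wait(), timeout=timeout_s)
            return self._decisions[request_id]
        except asyncio.TimeoutError:
            return "timeout"  # Caller should pause or escalate, not approve
        finally:
            self._events.pop(request_id, None)
            self._decisions.pop(request_id, None)

    def resolve(self, request_id: str, decision: str) -> None:
        """Called by the webhook handler when the human responds."""
        self._decisions[request_id] = decision
        self._events.setdefault(request_id, asyncio.Event()).set()


async def demo() -> str:
    approvals = PendingApprovals()

    async def human_clicks_approve() -> None:
        await asyncio.sleep(0.01)  # Stand-in for the webhook arriving later
        approvals.resolve("req-1", "approved")

    task = asyncio.create_task(human_clicks_approve())
    decision = await approvals.wait_for_decision("req-1", timeout_s=1.0)
    await task
    return decision


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

While the agent awaits the event, the event loop stays free to run other tasks, which is the difference from the polling loop with `asyncio.sleep(5)`.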

Q: How do you handle the case where a human denies an action that is on the critical path?

A: First, I never block silently - I report the denial clearly and its implications. Then I trigger replanning: "Human denied {action}. Replan the remaining work without assuming this action will be taken." The replanner receives context about what was denied and why (if the human provided a reason), and generates an alternative plan. For actions that genuinely cannot be skipped (e.g., "the user denied deleting the old database, but the new schema is incompatible"), the agent escalates to the human with an explicit statement: "I cannot proceed without deleting the old table - please clarify or allow the deletion."

Q: What is the difference between checkpoint-based and trigger-based interruption modes?

A: Checkpoint-based interruption pauses at predefined phase boundaries regardless of agent state - "before deployment," "after planning," "after writing tests." It is predictable but may interrupt even when things are going perfectly. Trigger-based interruption pauses when a specific condition is met - confidence below threshold, dangerous action detected, cost limit reached. It interrupts only when needed but requires good condition design. In practice, I combine them: checkpoint-based for the major phase transitions (planning complete → execution start → deployment → notification) where human review is always valuable, and trigger-based within phases for edge cases.

© 2026 EngineersOfAI. All rights reserved.