AI Security Governance

Reading time: ~28 min  |  Interview relevance: High  |  Target roles: AI Engineer, Engineering Manager, CISO, AI Safety Lead, Applied Scientist

The Audit That Revealed Everything

The AI governance audit had been scheduled for months. The consulting firm arrived with a 47-page questionnaire and a team of six. Three days in, the lead auditor called the CTO for an urgent meeting.

The company had deployed 14 AI systems over the past three years: customer service bots, fraud detection, content moderation, hiring screening, credit scoring, document summarization, predictive maintenance, churn prediction. Each system had been built by a different team, often with different vendors, with different levels of documentation. Some were open-source models hosted on internal infrastructure. Some were API calls to large providers. Some were fine-tuned on proprietary data. Several had been iterated significantly since their initial deployment. One had been in production for three years and nobody was entirely certain which model version it was running.

The findings were stark: of the 14 systems, exactly three had written security assessments. Two had formal model cards. One had a documented incident response process for AI failures. None had a privacy impact assessment for training data. Seven were processing personal data of EU residents, putting the company in scope for GDPR, including its rules on automated decision-making. Two systems were making employment decisions, triggering the EU AI Act's high-risk classification requirements, which carry mandatory conformity assessments and regulatory filing obligations. Three more systems were making financial decisions that triggered additional sector-specific regulations the company had not identified.

The CTO asked what it would cost to get compliant. The auditor's answer: "We need to figure out what you're actually running first." The company had built its AI portfolio organically, one system at a time, without governance infrastructure. The individual systems were technically sound - well-engineered, accurate, monitored for drift. But as a portfolio, they represented significant legal, reputational, and security exposure that no single person at the company had a complete picture of.

The investigation took four months and cost more than the entire annual AI engineering budget. Several systems were taken offline pending compliance review. Two regulatory inquiries followed.

This is what AI security governance prevents: not bad AI systems, but AI systems deployed without accountability infrastructure.


Why Governance Matters Now

AI systems have three properties that make them harder to govern than traditional software:

Probabilistic behavior: Traditional software does what it is programmed to do, so a test suite can cover most of its behavioral space. AI systems have emergent behaviors that cannot be fully specified in advance. A model that performs well on your evaluation suite may behave unexpectedly in production on edge-case inputs that were not in the test distribution.

Data dependency: The model's behavior is determined not just by its code but by its training data. This makes data provenance a security concern, not just a quality concern. Who contributed the training data? Under what terms? What biases did it contain? These questions have regulatory and liability implications that code-based systems do not.

Rapid capability growth: Model capabilities are increasing faster than governance frameworks can adapt. A model capability that was academic research 18 months ago may be in a production decision-making system today, with regulations still being drafted.

| Governance Dimension | Traditional Software | AI Systems |
| --- | --- | --- |
| Behavioral coverage | Deterministic - test suite can cover most cases | Probabilistic - distribution shift causes unexpected behavior |
| Code review | Peer review of logic ensures expected behavior | No equivalent review for model weights |
| Behavior change | Only with code deployment | Can drift with data changes, fine-tuning, model updates |
| Ownership | Clear - author wrote it | Training data provenance may be complex |
| Regulatory framework | Sector-specific regulations (finance, health) | Cross-sector AI-specific regulations emerging |
| Incident response | Known failure modes, rollback = revert code | Unknown failure modes, rollback may require retraining |
| Documentation | API contracts, code comments | Model cards, datasheets, system cards |
| Third-party dependency | Library dependencies auditable | Third-party model APIs are black boxes |

The Governance Stack

AI governance operates at four interdependent levels. Policy without procedure is aspirational. Procedure without technical controls is unenforceable. Technical controls without policy are uncoordinated. All four must work together:

The most common governance failure is having Level 1 (a published responsible AI policy) without Level 4 (technical controls that enforce it). Policies that cannot be technically audited or monitored are not governance - they are PR.
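
One way to make that gap visible is to record, for each policy statement, which technical controls enforce it, and to flag statements with none. The sketch below is illustrative only: `PolicyStatement`, `find_unenforced_policies`, and the example policy and control names are hypothetical and not part of any framework referenced in this article.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyStatement:
    """One commitment from a published AI policy (hypothetical structure)."""
    policy_id: str
    text: str
    # IDs of technical controls (monitors, CI checks, filters) that enforce it
    enforcing_controls: list[str] = field(default_factory=list)


def find_unenforced_policies(policies: list[PolicyStatement]) -> list[str]:
    """Return the IDs of policy statements with no technical control attached."""
    return [p.policy_id for p in policies if not p.enforcing_controls]


# Example data (invented for illustration)
policies = [
    PolicyStatement("POL-1", "PII is never written to model logs", ["output_pii_scanner"]),
    PolicyStatement("POL-2", "High-risk decisions receive human review"),
]

unenforced = find_unenforced_policies(policies)
print(unenforced)
```

A statement that shows up in `unenforced` is a candidate for either a new control or removal from the policy; leaving it in place is the "policy as PR" failure described above.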


Risk Classification Framework

Not all AI systems carry the same risk. A tiered framework matches governance overhead to actual risk, so high-risk systems get comprehensive controls without drowning low-risk tools in overhead:

from dataclasses import dataclass, field
from enum import Enum

class RiskTier(Enum):
    CRITICAL = "critical"  # Tier 1: Life safety, high-volume financial, vulnerable populations
    HIGH = "high"          # Tier 2: Employment, credit, significant financial decisions
    MEDIUM = "medium"      # Tier 3: Customer-facing, PII processing, reputational risk
    LOW = "low"            # Tier 4: Internal tools, low-impact decisions, non-sensitive data

@dataclass
class RiskDimension:
    name: str
    value: str
    score: int
    rationale: str

@dataclass
class AISystemRiskProfile:
    """Comprehensive risk profile for an AI system deployment."""
    system_id: str
    system_name: str
    description: str
    owner_team: str
    deployment_date: str

    # Risk dimensions (each scored independently)
    decision_impact: str  # "life_safety", "employment", "financial", "access_to_services", "reputational", "informational"
    human_oversight: str  # "none", "partial", "full"
    data_sensitivity: str  # "special_category", "highly_sensitive", "sensitive", "non_sensitive"
    user_vulnerability: str  # "vulnerable_populations", "general_public", "professionals"
    deployment_scale: str  # "millions", "hundreds_of_thousands", "thousands", "internal"
    decision_reversibility: str  # "irreversible", "difficult_to_reverse", "reversible"
    regulatory_scope: list[str]  # Which regulations apply

    # Computed at init
    risk_tier: RiskTier = field(init=False)
    risk_score: int = field(init=False)
    required_controls: list[str] = field(init=False)

    def __post_init__(self):
        self.risk_score = self._compute_risk_score()
        self.risk_tier = self._compute_risk_tier()
        self.required_controls = self._get_required_controls()

    def _compute_risk_score(self) -> int:
        # Sum of six dimension scores; possible range is 5 to 23.
        # Unrecognized values fall back to the lowest score for that dimension.
        score = 0

        impact_scores = {
            "life_safety": 5, "employment": 4, "financial": 3,
            "access_to_services": 3, "reputational": 2, "informational": 1
        }
        score += impact_scores.get(self.decision_impact, 1)

        oversight_scores = {"none": 4, "partial": 2, "full": 1}
        score += oversight_scores.get(self.human_oversight, 1)

        sensitivity_scores = {
            "special_category": 4, "highly_sensitive": 3,
            "sensitive": 2, "non_sensitive": 1
        }
        score += sensitivity_scores.get(self.data_sensitivity, 1)

        vulnerability_scores = {
            "vulnerable_populations": 4, "general_public": 2, "professionals": 1
        }
        score += vulnerability_scores.get(self.user_vulnerability, 1)

        scale_scores = {
            "millions": 3, "hundreds_of_thousands": 2, "thousands": 1, "internal": 0
        }
        score += scale_scores.get(self.deployment_scale, 0)

        reversibility_scores = {
            "irreversible": 3, "difficult_to_reverse": 2, "reversible": 1
        }
        score += reversibility_scores.get(self.decision_reversibility, 1)

        return score

    def _compute_risk_tier(self) -> RiskTier:
        if self.risk_score >= 17:
            return RiskTier.CRITICAL
        elif self.risk_score >= 12:
            return RiskTier.HIGH
        elif self.risk_score >= 7:
            return RiskTier.MEDIUM
        else:
            return RiskTier.LOW

    def _get_required_controls(self) -> list[str]:
        controls_by_tier = {
            RiskTier.CRITICAL: [
                "independent_security_assessment",
                "privacy_impact_assessment",
                "bias_audit_third_party",
                "red_team_engagement",
                "continuous_monitoring",
                "mandatory_human_review_of_decisions",
                "explainability_capability",
                "board_or_executive_level_approval",
                "regulatory_filing_or_notification",
                "quarterly_review",
                "incident_response_plan",
                "rollback_capability",
                "stakeholder_impact_assessment",
                "ongoing_bias_monitoring",
                "appeal_mechanism_for_affected_persons"
            ],
            RiskTier.HIGH: [
                "security_assessment",
                "privacy_impact_assessment",
                "bias_testing",
                "red_team_engagement",
                "production_monitoring",
                "human_escalation_path",
                "model_card",
                "ciso_approval",
                "incident_response_plan",
                "annual_review",
                "rollback_capability"
            ],
            RiskTier.MEDIUM: [
                "security_checklist",
                "privacy_review",
                "basic_bias_testing",
                "production_monitoring",
                "model_card",
                "engineering_manager_approval",
                "bi_annual_review"
            ],
            RiskTier.LOW: [
                "security_checklist",
                "basic_documentation",
                "annual_review"
            ]
        }
        return controls_by_tier.get(self.risk_tier, [])

    def generate_summary(self) -> str:
        # The maximum possible score is 23 (5 + 4 + 4 + 4 + 3 + 3).
        return f"""
AI System Risk Assessment
=========================
System: {self.system_name} ({self.system_id})
Owner: {self.owner_team}
Risk Tier: {self.risk_tier.value.upper()} (score: {self.risk_score}/23)
Regulatory Scope: {', '.join(self.regulatory_scope) or 'None identified'}

Required Controls ({len(self.required_controls)} total):
{chr(10).join(f' - {c}' for c in self.required_controls)}
"""


# Example applications
def classify_common_ai_systems() -> list[AISystemRiskProfile]:
    """Classify typical AI systems by risk tier."""
    return [
        # Scores 4 + 2 + 2 + 2 + 1 + 2 = 13 -> HIGH
        AISystemRiskProfile(
            system_id="SYS-001",
            system_name="Hiring Resume Screener",
            description="AI ranks and filters job applicants based on resume content",
            owner_team="People Ops",
            deployment_date="2024-03",
            decision_impact="employment",
            human_oversight="partial",
            data_sensitivity="sensitive",
            user_vulnerability="general_public",
            deployment_scale="thousands",
            decision_reversibility="difficult_to_reverse",
            regulatory_scope=["EU_AI_Act", "GDPR", "EEOC"]
        ),
        # Scores 3 + 2 + 2 + 2 + 1 + 1 = 11 -> MEDIUM
        AISystemRiskProfile(
            system_id="SYS-002",
            system_name="Customer Service Chatbot",
            description="LLM chatbot for billing and account questions",
            owner_team="Customer Success",
            deployment_date="2024-06",
            decision_impact="financial",
            human_oversight="partial",
            data_sensitivity="sensitive",
            user_vulnerability="general_public",
            deployment_scale="thousands",
            decision_reversibility="reversible",
            regulatory_scope=["GDPR", "CCPA"]
        ),
        # Scores 1 + 1 + 2 + 1 + 0 + 1 = 6 -> LOW
        AISystemRiskProfile(
            system_id="SYS-003",
            system_name="Internal Document Summarizer",
            description="Summarizes internal meeting notes and reports",
            owner_team="Engineering",
            deployment_date="2024-09",
            decision_impact="informational",
            human_oversight="full",
            data_sensitivity="sensitive",
            user_vulnerability="professionals",
            deployment_scale="internal",
            decision_reversibility="reversible",
            regulatory_scope=[]
        ),
    ]

The key discipline is applying the framework consistently before deployment, not after something goes wrong. Most AI governance failures involve systems that were never classified at all - they were deployed as "just a simple tool" and grew in scope without triggering a re-assessment.
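
A re-assessment trigger can be as simple as comparing the risk-relevant fields recorded at launch with the system's current state. The following is a minimal sketch under stated assumptions: the snapshots are plain dictionaries keyed by the risk dimension names used above, and `REASSESSMENT_FIELDS`, `changed_risk_fields`, and `needs_reassessment` are hypothetical helper names.

```python
# Risk dimensions whose change should force a fresh classification
# (assumed list, mirroring the dimensions scored by the framework above)
REASSESSMENT_FIELDS = (
    "decision_impact",
    "human_oversight",
    "data_sensitivity",
    "user_vulnerability",
    "deployment_scale",
    "decision_reversibility",
)


def changed_risk_fields(previous: dict, current: dict) -> list[str]:
    """List the risk dimensions that differ between two snapshots."""
    return [f for f in REASSESSMENT_FIELDS if previous.get(f) != current.get(f)]


def needs_reassessment(previous: dict, current: dict) -> bool:
    """True when any risk-relevant dimension has changed since last review."""
    return bool(changed_risk_fields(previous, current))


# A "simple internal tool" that quietly grew into an employment-decision system
at_launch = {
    "decision_impact": "informational",
    "human_oversight": "full",
    "deployment_scale": "internal",
}
today = {**at_launch, "decision_impact": "employment", "deployment_scale": "thousands"}

print(changed_risk_fields(at_launch, today))
print(needs_reassessment(at_launch, today))
```

Running a comparison like this on every release, or on a schedule, is one way to catch scope growth before an audit does.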


Regulatory Compliance Framework

The regulatory landscape for AI is evolving rapidly. Rather than tracking each regulation separately, build a unified compliance infrastructure that maps requirements to controls:

from dataclasses import dataclass, field

@dataclass
class ComplianceRequirement:
    regulation: str
    article: str
    requirement_summary: str
    implementation_approach: str
    evidence_required: list[str]
    applies_when: str
    applies_to_tiers: list[str]
EU_AI_ACT_REQUIREMENTS = [
    ComplianceRequirement(
        regulation="EU AI Act",
        article="Article 9",
        requirement_summary="Risk Management System",
        implementation_approach="Documented risk assessment: identify, estimate, evaluate, mitigate. Review at each lifecycle gate.",
        evidence_required=[
            "Risk management procedure document",
            "Risk assessment record per system",
            "Residual risk acceptance with sign-off",
            "Annual risk review records"
        ],
        applies_when="High-risk AI system (Annex III)",
        applies_to_tiers=["critical", "high"]
    ),
    ComplianceRequirement(
        regulation="EU AI Act",
        article="Article 10",
        requirement_summary="Data and Data Governance",
        implementation_approach="Training data documented for relevance, representativeness, completeness, privacy protections, and bias testing",
        evidence_required=[
            "Training data description",
            "Bias testing results disaggregated by demographic",
            "Data provenance records",
            "Data governance policy"
        ],
        applies_when="High-risk AI system",
        applies_to_tiers=["critical", "high"]
    ),
    ComplianceRequirement(
        regulation="EU AI Act",
        article="Article 11",
        requirement_summary="Technical Documentation",
        implementation_approach="Pre-market documentation covering purpose, architecture, performance, known limitations",
        evidence_required=[
            "Model card or system card",
            "Architecture documentation",
            "Performance benchmark results",
            "Known limitations and failure modes"
        ],
        applies_when="High-risk AI system",
        applies_to_tiers=["critical", "high"]
    ),
    ComplianceRequirement(
        regulation="EU AI Act",
        article="Article 13",
        requirement_summary="Transparency and User Information",
        implementation_approach="User-facing disclosure of AI use; explanation capability for consequential decisions",
        evidence_required=[
            "User-facing AI disclosure",
            "Explanation mechanism documentation",
            "User rights documentation"
        ],
        applies_when="High-risk AI system",
        applies_to_tiers=["critical", "high"]
    ),
    ComplianceRequirement(
        regulation="EU AI Act",
        article="Article 14",
        requirement_summary="Human Oversight",
        implementation_approach="Effective human oversight capability throughout the deployment period; override capability documented",
        evidence_required=[
            "Human oversight procedure",
            "Override mechanism documentation",
            "Operator training records"
        ],
        applies_when="High-risk AI system",
        applies_to_tiers=["critical", "high"]
    ),
    ComplianceRequirement(
        regulation="EU AI Act",
        article="Article 72",
        requirement_summary="Post-Market Monitoring",
        implementation_approach="Ongoing monitoring of performance, bias, and safety in production; incident reporting to national authority",
        evidence_required=[
            "Monitoring system documentation",
            "Performance reports",
            "Incident log",
            "Serious incident reports filed with authority"
        ],
        applies_when="High-risk AI system",
        applies_to_tiers=["critical", "high"]
    ),
]

# Note: the GOVERN and MAP summaries below are category-level statements
# (GOVERN 1, MAP 1); verify exact subcategory wording against the NIST AI RMF
# Core before citing it in an audit.
NIST_AI_RMF_REQUIREMENTS = [
    ComplianceRequirement(
        regulation="NIST AI RMF",
        article="GOVERN 1",
        requirement_summary="Policies, processes, and procedures for AI risk management established",
        implementation_approach="AI governance policy document with roles, responsibilities, and approval processes",
        evidence_required=["AI governance policy", "Role assignment", "Process documentation"],
        applies_when="Any AI system",
        applies_to_tiers=["critical", "high", "medium", "low"]
    ),
    ComplianceRequirement(
        regulation="NIST AI RMF",
        article="MAP 1",
        requirement_summary="Context established and understood",
        implementation_approach="Document intended purpose, deployment context, expected users, and affected stakeholders",
        evidence_required=["System card", "Use case documentation", "Stakeholder analysis"],
        applies_when="Any AI system",
        applies_to_tiers=["critical", "high", "medium", "low"]
    ),
    ComplianceRequirement(
        regulation="NIST AI RMF",
        article="MEASURE 2.5",
        requirement_summary="AI system to be deployed is demonstrated to be valid and reliable",
        implementation_approach="Pre-deployment testing against intended use cases; adversarial testing for misuse",
        evidence_required=["Test results", "Red team report", "Adversarial robustness results"],
        applies_when="Any AI system",
        applies_to_tiers=["critical", "high", "medium"]
    ),
]


def assess_regulatory_compliance(
    system: AISystemRiskProfile
) -> dict:
    """
    Automated compliance gap assessment for a given AI system.
    Maps system risk tier and regulatory scope to specific requirements.
    """
    all_requirements = EU_AI_ACT_REQUIREMENTS + NIST_AI_RMF_REQUIREMENTS

    # A requirement applies when the tier matches and the regulation is in scope.
    # Scope entries use underscores ("EU_AI_Act"), regulation names use spaces.
    applicable = [
        req for req in all_requirements
        if system.risk_tier.value in req.applies_to_tiers
        and (
            req.regulation.replace(" ", "_") in system.regulatory_scope
            or req.regulation == "NIST AI RMF"  # NIST is voluntary best practice
        )
    ]

    # Placeholder: treats every required control as if evidence already exists.
    # In practice: check which required controls have evidence in the registry.
    controls_with_evidence = set(system.required_controls)

    gaps = []
    met = []
    for req in applicable:
        # Simplified: check if documentation-related controls exist.
        # This check does not depend on `req`, so all applicable requirements
        # receive the same verdict; a real assessment compares each
        # requirement's evidence_required against recorded evidence.
        has_implementation = any(
            ctrl in controls_with_evidence
            for ctrl in [
                "model_card", "privacy_impact_assessment", "security_assessment",
                "bias_audit_third_party", "bias_testing", "incident_response_plan",
                "documentation", "basic_documentation"
            ]
        )
        if has_implementation:
            met.append({"regulation": req.regulation, "article": req.article, "status": "met"})
        else:
            gaps.append({
                "regulation": req.regulation,
                "article": req.article,
                "requirement": req.requirement_summary,
                "evidence_needed": req.evidence_required,
                "status": "gap"
            })

    return {
        "system_id": system.system_id,
        "risk_tier": system.risk_tier.value,
        "applicable_requirements": len(applicable),
        "requirements_met": len(met),
        "gaps": gaps,
        "gap_count": len(gaps),
        "compliance_percentage": round(len(met) / len(applicable) * 100 if applicable else 100, 1),
        "compliant": len(gaps) == 0
    }
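
The assessment function marks its evidence check as simplified. A per-requirement check could look roughly like the sketch below, which compares each requirement's evidence list against a set of evidence items already on file. It is illustrative only: `find_evidence_gaps` and the `on_file` registry are hypothetical, and the requirement keys are just labels reused from the lists above.

```python
def find_evidence_gaps(
    required_evidence: dict[str, list[str]],
    evidence_registry: set[str],
) -> dict[str, list[str]]:
    """Map each requirement to the evidence items that are not yet on file."""
    gaps: dict[str, list[str]] = {}
    for requirement, needed in required_evidence.items():
        missing = [item for item in needed if item not in evidence_registry]
        if missing:
            gaps[requirement] = missing
    return gaps


# Example data (invented for illustration)
required = {
    "EU AI Act Article 11": ["Model card or system card", "Architecture documentation"],
    "EU AI Act Article 14": ["Human oversight procedure"],
}
on_file = {"Model card or system card", "Human oversight procedure"}

gaps = find_evidence_gaps(required, on_file)
print(gaps)
```

Because the result names the missing items, it can feed a remediation backlog directly instead of only reporting a percentage.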

Regulatory Landscape Overview


AI System Lifecycle Governance

Every AI system should pass through mandatory governance checkpoints. Gates that are optional in practice are not gates - they are suggestions:

from enum import Enum
from dataclasses import dataclass, field
from datetime import datetime

class LifecycleStage(Enum):
    CONCEPTION = "conception"
    DEVELOPMENT = "development"
    PRE_DEPLOYMENT = "pre_deployment"
    PRODUCTION = "production"
    DEPRECATED = "deprecated"
    DECOMMISSIONED = "decommissioned"

@dataclass
class GovernanceGate:
    """A mandatory checkpoint in the AI system lifecycle."""
    name: str
    from_stage: LifecycleStage
    to_stage: LifecycleStage
    required_artifacts: list[str]
    required_approvals: list[str]
    required_tests: list[str]
    applies_to_tiers: list[str]
    blocking: bool = True  # If True, system cannot proceed without passing

@dataclass
class GateValidationResult:
    gate_name: str
    system_id: str
    can_proceed: bool
    missing_artifacts: list[str]
    missing_approvals: list[str]
    failed_tests: list[str]
    recommendation: str
    validated_at: str


LIFECYCLE_GATES = [
    GovernanceGate(
        name="Development Authorization",
        from_stage=LifecycleStage.CONCEPTION,
        to_stage=LifecycleStage.DEVELOPMENT,
        required_artifacts=[
            "System description (1-page)",
            "Risk tier classification",
            "Intended use and prohibited use statement",
            "Data governance plan",
            "Privacy impact assessment (draft)"
        ],
        required_approvals=["AI governance committee"],
        required_tests=[],
        applies_to_tiers=["critical", "high", "medium"],
        blocking=True
    ),
    GovernanceGate(
        name="Pre-Deployment Security Review",
        from_stage=LifecycleStage.DEVELOPMENT,
        to_stage=LifecycleStage.PRE_DEPLOYMENT,
        required_artifacts=[
            "Model card (final)",
            "Security assessment report",
            "Bias evaluation results (disaggregated)",
            "Red team report",
            "Incident response plan",
            "Privacy impact assessment (final)",
            "Data lineage documentation"
        ],
        required_approvals=["CISO", "Legal", "Product owner"],
        required_tests=[
            "Security penetration test",
            "Adversarial robustness evaluation",
            "Bias benchmark suite",
            "Out-of-distribution behavior testing"
        ],
        applies_to_tiers=["critical", "high"],
        blocking=True
    ),
    GovernanceGate(
        name="Production Deployment",
        from_stage=LifecycleStage.PRE_DEPLOYMENT,
        to_stage=LifecycleStage.PRODUCTION,
        required_artifacts=[
            "Deployment runbook",
            "Monitoring and alerting setup",
            "Rollback procedure",
            "On-call playbook for AI-specific incidents"
        ],
        required_approvals=["Engineering manager", "Security lead"],
        required_tests=[
            "Smoke test suite passing",
            "Shadow mode validation (if applicable)"
        ],
        applies_to_tiers=["critical", "high", "medium"],
        blocking=True
    ),
    GovernanceGate(
        name="Annual Production Review",
        from_stage=LifecycleStage.PRODUCTION,
        to_stage=LifecycleStage.PRODUCTION,  # Stay in production or trigger remediation
        required_artifacts=[
            "Annual performance report",
            "Bias drift assessment",
            "Security incident log review",
            "Model card update"
        ],
        required_approvals=["Product owner", "AI safety lead"],
        required_tests=[
            "Re-run bias benchmark suite",
            "Adversarial robustness re-evaluation"
        ],
        applies_to_tiers=["critical", "high"],
        blocking=False  # Failure triggers remediation plan, not shutdown
    ),
    GovernanceGate(
        name="Deprecation",
        from_stage=LifecycleStage.PRODUCTION,
        to_stage=LifecycleStage.DEPRECATED,
        required_artifacts=[
            "Deprecation notice to users (if user-facing)",
            "Data retention and deletion plan",
            "Handoff or migration documentation"
        ],
        required_approvals=["Product owner"],
        required_tests=[],
        applies_to_tiers=["critical", "high", "medium", "low"],
        blocking=False
    ),
]

def validate_lifecycle_gate(
    system: AISystemRiskProfile,
    gate: GovernanceGate,
    submitted_artifacts: list[str],
    obtained_approvals: list[str],
    passed_tests: list[str]
) -> GateValidationResult:
    """
    Validate whether a system meets all requirements for a lifecycle gate.
    Returns detailed gap analysis for remediation.
    """
    # Only apply gate requirements relevant to this system's risk tier
    if system.risk_tier.value not in gate.applies_to_tiers:
        return GateValidationResult(
            gate_name=gate.name,
            system_id=system.system_id,
            can_proceed=True,
            missing_artifacts=[],
            missing_approvals=[],
            failed_tests=[],
            recommendation="APPROVED (gate not required for this risk tier)",
            validated_at=datetime.utcnow().isoformat()
        )

    # Exact string matching: names must match the gate definition verbatim
    missing_artifacts = [a for a in gate.required_artifacts if a not in submitted_artifacts]
    missing_approvals = [a for a in gate.required_approvals if a not in obtained_approvals]
    failed_tests = [t for t in gate.required_tests if t not in passed_tests]

    can_proceed = not (missing_artifacts or missing_approvals or failed_tests)

    if can_proceed:
        recommendation = "APPROVED"
    elif gate.blocking:
        recommendation = "BLOCKED - resolve all gaps before proceeding"
    else:
        recommendation = "PROCEED WITH REMEDIATION PLAN"

    return GateValidationResult(
        gate_name=gate.name,
        system_id=system.system_id,
        can_proceed=can_proceed,
        missing_artifacts=missing_artifacts,
        missing_approvals=missing_approvals,
        failed_tests=failed_tests,
        recommendation=recommendation,
        validated_at=datetime.utcnow().isoformat()
    )
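
For a gate to be more than a suggestion, a failed blocking gate has to stop the deployment pipeline rather than only produce a report. The sketch below shows one possible shape for that enforcement step. It is an assumption-laden illustration: `GateBlockedError` and `enforce_gate` are hypothetical names, and the function takes plain values instead of the dataclasses above so that it stands alone.

```python
class GateBlockedError(Exception):
    """Raised when a blocking governance gate has unresolved gaps."""


def enforce_gate(gate_name: str, blocking: bool, missing_items: list[str]) -> str:
    """Return a status string, or raise if a blocking gate has gaps."""
    if not missing_items:
        return "APPROVED"
    if blocking:
        # Raising makes a CI/CD job fail instead of silently continuing
        raise GateBlockedError(f"{gate_name}: missing {', '.join(missing_items)}")
    return "PROCEED WITH REMEDIATION PLAN"


# Non-blocking gate with a gap: the pipeline continues, with a remediation flag
print(enforce_gate("Annual Production Review", False, ["Bias drift assessment"]))

# Blocking gate with a gap: the pipeline stops
try:
    enforce_gate("Pre-Deployment Security Review", True, ["Red team report"])
except GateBlockedError as exc:
    print(f"Deployment halted: {exc}")
```

Wiring the exception into the deployment job's exit status is what turns the gate definition into an actual control.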

Incident Response for AI Systems

AI incidents require specialized incident response that differs from traditional security incident response in several important ways: they may require model rollback (not just code rollback), they may involve training data investigation, they may have privacy implications from model memorization, and they may trigger regulatory notification windows.

import anthropic
from dataclasses import dataclass, field
from enum import Enum
from datetime import datetime

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

class AIIncidentType(Enum):
    SAFETY_VIOLATION = "safety_violation"    # Harmful content, unsafe recommendations
    SECURITY_BREACH = "security_breach"      # Jailbreak, extraction, data exfiltration
    BIAS_HARM = "bias_harm"                  # Discriminatory outputs at scale
    PRIVACY_VIOLATION = "privacy_violation"  # PII exposure, training data memorization
    PERFORMANCE_REGRESSION = "regression"    # Accuracy degradation, quality drop
    DATA_POISONING = "data_poisoning"        # Training or RAG corpus compromised
    RAG_INJECTION = "rag_injection"          # Indirect prompt injection via retrieval
    AVAILABILITY = "availability"            # System unavailable or severely degraded
@dataclass
class IncidentResponseMetrics:
    """Track response time metrics for SLA compliance."""
    incident_id: str
    discovered_at: datetime  # Expected to be a naive UTC datetime
    triage_completed_at: datetime | None = None
    containment_achieved_at: datetime | None = None
    notification_sent_at: datetime | None = None
    resolved_at: datetime | None = None

    @property
    def time_to_triage_minutes(self) -> float | None:
        if self.triage_completed_at:
            return (self.triage_completed_at - self.discovered_at).total_seconds() / 60
        return None

    @property
    def time_to_containment_hours(self) -> float | None:
        if self.containment_achieved_at:
            return (self.containment_achieved_at - self.discovered_at).total_seconds() / 3600
        return None

    @property
    def gdpr_notification_hours_remaining(self) -> float | None:
        """GDPR requires notification within 72 hours of discovery."""
        # Returns 0.0 once a notification has been sent or the window has elapsed
        if self.notification_sent_at:
            return 0.0
        elapsed = (datetime.utcnow() - self.discovered_at).total_seconds() / 3600
        return max(72.0 - elapsed, 0.0)


@dataclass
class AIIncident:
    """An AI-specific security or safety incident."""
    id: str
    type: AIIncidentType
    system_id: str
    discovered_at: datetime
    description: str
    affected_users_estimate: int
    severity: str  # "critical", "high", "medium", "low"
    eu_residents_affected: bool = False  # Triggers GDPR 72-hour window
    personal_data_exposed: bool = False
    regulatory_notification_required: bool = False
    root_cause: str = ""
    attack_vector: str = ""


class AIIncidentResponsePlan:
    """
    Structured incident response for AI-specific incidents.

    Response time SLAs by severity:
    - Critical: Triage in 15 min, containment in 2 hours, leadership notified in 30 min
    - High: Triage in 1 hour, containment in 8 hours
    - Medium: Triage in 4 hours, containment in 24 hours
    - Low: Triage by next business day

    AI-specific differences from traditional IR:
    - "Rollback" may mean model version rollback (not just code revert)
    - Training data investigation may be required
    - Privacy violations may include memorized training data (not just breach)
    - Regulatory windows: GDPR 72h, EU AI Act serious incident reporting
    """

    # All values are in minutes (1440 = 1 day, 2880 = 2 days, 4320 = 3 days)
    SLA_MINUTES = {
        "critical": {"triage": 15, "containment": 120, "leadership_notify": 30},
        "high": {"triage": 60, "containment": 480, "leadership_notify": 120},
        "medium": {"triage": 240, "containment": 1440, "leadership_notify": 480},
        "low": {"triage": 1440, "containment": 4320, "leadership_notify": 2880},
    }

    def triage(self, incident: AIIncident) -> dict:
        """Initial triage: determine severity, immediate actions, escalation."""
        # Unknown severities fall back to the medium SLA
        sla = self.SLA_MINUTES.get(incident.severity, self.SLA_MINUTES["medium"])
        escalation_path = []

        # Severity-based escalation
        if incident.severity == "critical":
            escalation_path = [
                f"Page AI Security on-call NOW (SLA: {sla['triage']} min for triage)",
                f"Notify CISO within {sla['leadership_notify']} minutes",
                "Open war room - all hands on incident",
                "Assess: shutdown or traffic diversion needed immediately?"
            ]
        elif incident.severity == "high":
            escalation_path = [
                f"Alert AI Security team (SLA: {sla['triage']} min for triage)",
                f"Notify engineering manager within {sla['leadership_notify']} minutes",
                "Begin impact assessment"
            ]

        # Incident-type-specific actions
        type_actions = {
            AIIncidentType.SAFETY_VIOLATION: [
                "Preserve all violating input/output pairs (do not delete)",
                "Determine scope: isolated edge case or systematic failure?",
                "Check if system prompt was bypassed or model failed",
                "Consider emergency system prompt patch as temporary containment"
            ],
            AIIncidentType.SECURITY_BREACH: [
                "Identify attack vector immediately",
                "Revoke any compromised credentials or API keys",
                "Enable enhanced input validation as emergency measure",
                "Preserve attack inputs/patterns for forensics"
            ],
            AIIncidentType.DATA_POISONING: [
                "Identify and quarantine suspected poisoned data",
                "Determine contamination scope",
                "Assess whether model retraining is required",
                "Preserve chain of custody for forensic investigation"
            ],
            AIIncidentType.PRIVACY_VIOLATION: [
                "GDPR 72-HOUR NOTIFICATION WINDOW STARTS NOW if EU residents affected",
                "Identify what personal data was exposed and to whom",
                "Determine affected individuals for notification",
                "Preserve evidence for regulatory response"
            ],
            AIIncidentType.BIAS_HARM: [
                "Identify affected demographic groups",
                "Quantify scale of biased outputs",
                "Document for regulatory response and user communications",
                "Assess whether decisions based on biased output need remediation"
            ],
            AIIncidentType.RAG_INJECTION: [
                "Identify the poisoned document(s) in the knowledge base",
                "Query audit log for all retrievals of affected document(s)",
                "Identify all users whose sessions retrieved affected documents",
                "Quarantine poisoned documents immediately"
            ],
        }

        # Escalation steps first, then type-specific steps (none for unlisted types)
        immediate_actions = escalation_path + type_actions.get(incident.type, [])

        # The GDPR clock applies only when EU residents' personal data was exposed
        gdpr_clock_started = incident.eu_residents_affected and incident.personal_data_exposed

        return {
            "severity": incident.severity,
            "sla_minutes": sla,
            "immediate_actions": immediate_actions,
            "regulatory_clock_started": gdpr_clock_started,
            "gdpr_deadline": "72 hours from discovery" if gdpr_clock_started else "N/A"
        }

    def determine_containment(self, incident: AIIncident) -> list[str]:
        """Select containment strategy based on incident type."""
        strategies = {
            AIIncidentType.SAFETY_VIOLATION: [
                "Emergency system prompt patch with explicit constraints",
                "Activate keyword blocklist for identified harmful patterns",
                "Route matching query patterns to human review",
                "If systematic: consider full system shutdown pending investigation"
            ],
            AIIncidentType.SECURITY_BREACH: [
                "Revoke compromised API keys and credentials",
                "Enable enhanced input validation and rate limiting",
                "Deploy query pattern blocking for identified attack vectors",
                "Rotate all relevant secrets"
            ],
            AIIncidentType.DATA_POISONING: [
                "Remove identified poisoned documents from RAG corpus",
                "Rebuild vector index from verified-clean snapshot",
                "If training data poisoned: roll back to previous model version",
                "Audit all documents from compromised source or integration"
            ],
            AIIncidentType.PRIVACY_VIOLATION: [
                "Identify and disable queries that expose sensitive data",
                "Deploy emergency PII pattern detection on outputs",
                "Consider model version rollback if memorization is widespread",
                "Implement emergency output scanning and filtering"
            ],
            AIIncidentType.BIAS_HARM: [
                "Suspend affected decision pipeline pending bias audit",
                "Implement manual review for all affected decision types",
                "Roll back to previous model version if current version introduced bias",
                "Notify affected users of potential remediation"
            ],
        }
        # Incident types without a listed strategy get a generic fallback
        return strategies.get(incident.type, ["Assess scope before containment - consult AI security team"])

    def generate_notification(
        self,
        incident: AIIncident,
        audience: str
    ) -> str:
        """Generate regulatory/user/leadership notification using Claude."""
        audience_requirements = {
            "regulators": "Include: nature of incident, scope, immediate actions taken, remediation plan, timeline, GDPR Article 33 fields if applicable (nature of breach, categories and number of affected individuals, DPO contact, likely consequences, measures taken or proposed)",
            "affected_users": "Include: what happened in plain language, what data may have been affected, what actions they should take, how to get support, your rights (if EU residents: right to lodge complaint with supervisory authority)",
            "leadership": "Include: business impact, reputational risk, regulatory risk, decisions needed from leadership, resourcing required, timeline"
        }

        prompt = f"""Draft a professional notification for the following audience regarding an AI system security incident.

Audience: {audience}
Requirements for this audience: {audience_requirements.get(audience, "Professional, factual, clear")}

Incident details:
- Incident ID: {incident.id}
- Type: {incident.type.value}
- System: {incident.system_id}
- Severity: {incident.severity}
- Affected users (estimate): {incident.affected_users_estimate}
- EU residents affected: {incident.eu_residents_affected}
- Personal data exposed: {incident.personal_data_exposed}
- Discovery time: {incident.discovered_at.isoformat()}
- Description: {incident.description}

Write the notification. Be clear, factual, and appropriately brief. For regulators, be comprehensive. For users, prioritize actionability. For leadership, prioritize business decisions needed."""

        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1000,
            messages=[{"role": "user", "content": prompt}]
        )
        # The first content block holds the generated text
        return response.content[0].text

Incident Response SLA Table

| Severity | Triage SLA | Containment SLA | Leadership Notify | Regulatory Window |
|---|---|---|---|---|
| Critical | 15 minutes | 2 hours | 30 minutes | Immediate assessment |
| High | 1 hour | 8 hours | 2 hours | Within 24 hours |
| Medium | 4 hours | 24 hours | 8 hours | Next business day |
| Low | Next business day | 3 business days | Weekly report | N/A |
| Any with EU PII | - | - | - | GDPR 72-hour window |
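
The fixed-duration entries in the table above can be turned into concrete deadlines once an incident has a discovery timestamp. The sketch below is illustrative only: `SLA_HOURS`, `GDPR_NOTIFICATION_HOURS`, and `sla_deadlines` are hypothetical names, the "next business day" style entries are left out because they would need a business calendar, and the discovery time is used as a stand-in for the moment the organization becomes aware of a breach.

```python
from datetime import datetime, timedelta

# Fixed-duration SLAs copied from the table (in hours). Entries such as
# "Next business day" are omitted because they depend on a business calendar.
SLA_HOURS = {
    "critical": {"triage": 0.25, "containment": 2, "leadership_notify": 0.5},
    "high": {"triage": 1, "containment": 8, "leadership_notify": 2},
    "medium": {"triage": 4, "containment": 24, "leadership_notify": 8},
}
# Applies at any severity when EU personal data is involved.
GDPR_NOTIFICATION_HOURS = 72


def sla_deadlines(severity: str, discovered_at: datetime, eu_pii: bool = False) -> dict:
    """Return absolute deadlines for each fixed-duration SLA stage."""
    if severity not in SLA_HOURS:
        raise ValueError(f"No fixed-duration SLA defined for severity {severity!r}")
    deadlines = {
        stage: discovered_at + timedelta(hours=hours)
        for stage, hours in SLA_HOURS[severity].items()
    }
    if eu_pii:
        # Assumption: discovery time approximates when the breach became known.
        deadlines["gdpr_notification"] = discovered_at + timedelta(
            hours=GDPR_NOTIFICATION_HOURS
        )
    return deadlines


discovered = datetime(2026, 3, 2, 9, 0)
for stage, due in sla_deadlines("high", discovered, eu_pii=True).items():
    print(f"{stage}: {due.isoformat()}")
```

Computing absolute timestamps at triage time makes the deadlines easy to attach to the incident record and to alert on, rather than leaving responders to do clock arithmetic mid-incident.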

Model Cards as Governance Artifacts

Model cards serve multiple governance purposes simultaneously: they are the primary artifact in deployment approval gates, the document provided to regulators, the communication to enterprise customers, and the living record of system properties across versions:

from dataclasses import dataclass


@dataclass
class ModelCard:
    """
    Structured model documentation (Mitchell et al. 2019, extended for governance).

    Serves as:
    - Primary artifact for deployment gate approval
    - Regulatory submission document (EU AI Act Article 11)
    - Customer-facing technical disclosure
    - Living record updated with each model version
    """
    # Identity and versioning
    model_id: str
    model_name: str
    version: str
    release_date: str
    previous_version: str
    changelog_summary: str
    authors: list[str]
    organization: str

    # Intended use
    primary_intended_uses: list[str]
    primary_intended_users: list[str]
    out_of_scope_uses: list[str]
    prohibited_uses: list[str]

    # Performance (must be disaggregated by group for high-risk systems)
    overall_performance: dict[str, str]       # benchmark → score
    performance_by_group: dict[str, dict]     # group → {benchmark: score}
    performance_thresholds: dict[str, float]  # benchmark → minimum acceptable score
    performance_caveats: list[str]

    # Technical details
    base_model: str
    fine_tuning_approach: str
    training_data_description: str
    training_data_size: str
    training_data_cutoff: str
    training_data_known_issues: list[str]
    evaluation_data_description: str

    # Limitations and risks
    known_limitations: list[str]
    failure_modes: list[str]
    identified_biases: list[str]
    safety_risks: list[str]
    mitigation_measures: list[str]

    # Governance metadata
    risk_tier: str
    required_oversight_level: str
    approved_by: list[str]
    approval_date: str
    review_schedule: str
    incident_contact: str
    regulatory_compliance: list[str]

    def validate_completeness(self) -> dict:
        """Check model card completeness for governance gate requirements."""
        required_for_high_risk = [
            "performance_by_group",  # Must be non-empty for high-risk
            "identified_biases",
            "failure_modes",
            "mitigation_measures",
            "incident_contact",
            "approved_by"
        ]

        gaps = []
        for field_name in required_for_high_risk:
            value = getattr(self, field_name, None)
            if not value:
                gaps.append(f"Missing or empty: {field_name}")

        # Check performance thresholds
        for benchmark, threshold in self.performance_thresholds.items():
            actual = self.overall_performance.get(benchmark)
            if actual is None:
                # A threshold with no reported score should not pass silently
                gaps.append(f"No reported score for thresholded benchmark: {benchmark}")
                continue
            try:
                actual_float = float(str(actual).strip('%'))
            except ValueError:
                # Surface unparseable scores instead of skipping them
                gaps.append(f"Unparseable score: {benchmark} = {actual!r}")
                continue
            if actual_float < threshold:
                gaps.append(
                    f"Performance below threshold: {benchmark} = {actual} < {threshold}"
                )

        return {
            "complete": len(gaps) == 0,
            "gaps": gaps,
            "suitable_for_high_risk_gate": len(gaps) == 0
        }

    def to_markdown(self) -> str:
        """Generate Markdown model card for documentation and regulatory submission."""
        groups_section = ""
        for group, metrics in self.performance_by_group.items():
            groups_section += f"\n**{group}**:\n"
            for metric, score in metrics.items():
                groups_section += f"- {metric}: {score}\n"

        # The template stays at column 0 so the rendered Markdown is not indented
        return f"""# Model Card: {self.model_name} v{self.version}

**Model ID**: `{self.model_id}`
**Organization**: {self.organization}
**Release Date**: {self.release_date}
**Previous Version**: {self.previous_version}
**Risk Tier**: {self.risk_tier}
**Regulatory Compliance**: {', '.join(self.regulatory_compliance)}

## Changelog
{self.changelog_summary}

## Intended Use

**Primary Uses**:
{chr(10).join(f'- {u}' for u in self.primary_intended_uses)}

**Out-of-Scope Uses**:
{chr(10).join(f'- {u}' for u in self.out_of_scope_uses)}

**Prohibited Uses**:
{chr(10).join(f'- **{u}**' for u in self.prohibited_uses)}

## Performance

**Overall**:
{chr(10).join(f'- {k}: {v}' for k, v in self.overall_performance.items())}

**By Group** (disaggregated):
{groups_section}

**Caveats**:
{chr(10).join(f'- {c}' for c in self.performance_caveats)}

## Known Limitations
{chr(10).join(f'- {item}' for item in self.known_limitations)}

## Failure Modes
{chr(10).join(f'- {f}' for f in self.failure_modes)}

## Identified Biases and Mitigations

**Biases**:
{chr(10).join(f'- {b}' for b in self.identified_biases)}

**Mitigation Measures**:
{chr(10).join(f'- {m}' for m in self.mitigation_measures)}

## Governance

| Field | Value |
|---|---|
| Risk Tier | {self.risk_tier} |
| Required Oversight | {self.required_oversight_level} |
| Approved By | {', '.join(self.approved_by)} |
| Approval Date | {self.approval_date} |
| Review Schedule | {self.review_schedule} |
| Incident Contact | {self.incident_contact} |
"""

Vendor Risk Management for AI

AI capabilities are increasingly sourced from third-party APIs (OpenAI, Anthropic, Google, Cohere) and open-source models. Each vendor relationship is a supply-chain dependency that requires structured risk assessment:

import anthropic
from dataclasses import dataclass

client = anthropic.Anthropic()


@dataclass
class AIVendorAssessment:
    """Risk assessment for an AI vendor or model provider."""
    vendor_name: str
    product_name: str
    access_type: str  # "api", "self_hosted", "fine_tuned", "licensed_weights"
    data_sent: str    # What user/company data is transmitted
    use_case: str

    # Security posture
    soc2_type2_certified: bool
    iso27001_certified: bool
    penetration_tested: bool
    bug_bounty_program: bool
    data_retention_days: int   # How long vendor retains input/output data
    opt_out_of_training: bool  # Can you prevent data use for model training

    # Reliability
    sla_uptime_percentage: float
    has_fallback_provider: bool

    # Contractual
    has_dpa: bool            # Data Processing Agreement
    eu_data_residency: bool  # Data stays in EU (for GDPR)
    liability_cap: str
    audit_rights: bool       # Can you audit vendor's AI practices


def assess_vendor_risk(vendor: AIVendorAssessment) -> dict:
    """
    Score vendor risk and generate assessment report.
    Returns risk score, tier, and remediation recommendations.
    """
    risk_points = 0
    findings = []

    # Security certifications
    if not vendor.soc2_type2_certified:
        risk_points += 3
        findings.append({"issue": "No SOC 2 Type 2 certification", "severity": "high"})
    if not vendor.penetration_tested:
        risk_points += 2
        findings.append({"issue": "No evidence of penetration testing", "severity": "medium"})

    # Data handling
    if vendor.data_retention_days > 30:
        risk_points += 2
        findings.append({
            "issue": f"Long data retention: {vendor.data_retention_days} days",
            "severity": "medium"
        })
    if not vendor.opt_out_of_training:
        risk_points += 3
        findings.append({
            "issue": "Cannot opt out of training data use - data may be used to train future models",
            "severity": "high"
        })

    # Contractual
    if not vendor.has_dpa:
        risk_points += 3
        findings.append({
            "issue": "No Data Processing Agreement - GDPR compliance at risk",
            "severity": "critical"
        })
    if not vendor.audit_rights:
        risk_points += 1
        findings.append({
            "issue": "No audit rights - cannot independently verify vendor claims",
            "severity": "low"
        })

    # Reliability
    if not vendor.has_fallback_provider:
        risk_points += 2
        findings.append({
            "issue": "No fallback provider - single point of failure",
            "severity": "medium"
        })
    if vendor.sla_uptime_percentage < 99.5:
        risk_points += 1
        findings.append({
            "issue": f"SLA uptime {vendor.sla_uptime_percentage}% below 99.5% threshold",
            "severity": "low"
        })

    # Determine risk tier
    if risk_points >= 8:
        risk_tier = "high"
        recommendation = "ESCALATE - requires CISO approval and contractual remediation before use"
    elif risk_points >= 4:
        risk_tier = "medium"
        recommendation = "CONDITIONAL - address critical and high findings before use"
    else:
        risk_tier = "low"
        recommendation = "APPROVED - monitor annually"

    critical = [f for f in findings if f["severity"] == "critical"]
    high = [f for f in findings if f["severity"] == "high"]

    # A critical finding (such as a missing DPA) must never pass as "low" risk,
    # even when the point total alone stays under the medium threshold.
    if critical and risk_tier == "low":
        risk_tier = "medium"
        recommendation = "CONDITIONAL - address critical and high findings before use"

    return {
        "vendor": vendor.vendor_name,
        "product": vendor.product_name,
        "risk_score": risk_points,
        "risk_tier": risk_tier,
        "recommendation": recommendation,
        "critical_findings": critical,
        "high_findings": high,
        "all_findings": findings,
        "use_approved": risk_tier == "low" or (risk_tier == "medium" and not critical)
    }

Building the AI Governance Function

RACI Matrix for AI Governance

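| Activity | CISO | AI Safety Lead | DPO | Legal | Engineering | Product |
|---|---|---|---|---|---|---|
| Risk tier classification | A | R | C | C | R | I |
| Security assessment | A | R | I | I | C | I |
| Privacy impact assessment | C | I | A/R | C | C | I |
| Bias evaluation | C | A/R | I | I | R | C |
| Deployment gate approval | A | C | C | C | R | C |
| Incident response lead | A | R | C | C | R | I |
| Regulatory filings | I | I | A/R | A/R | I | I |
| Vendor DPA negotiation | C | I | A/R | A/R | I | I |
| Red team program | A | R | I | I | C | I |
| Annual AI system review | A | R | C | C | R | R |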

R=Responsible, A=Accountable, C=Consulted, I=Informed
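
A RACI matrix is easier to keep current, and to query during an incident, when it lives as data rather than only in a document. The following is a minimal sketch that encodes the matrix above; `ROLES`, `RACI`, and `roles_with` are hypothetical names chosen for illustration.

```python
ROLES = ["CISO", "AI Safety Lead", "DPO", "Legal", "Engineering", "Product"]

# One entry per activity; cells follow the column order in ROLES.
RACI = {
    "Risk tier classification": ["A", "R", "C", "C", "R", "I"],
    "Security assessment": ["A", "R", "I", "I", "C", "I"],
    "Privacy impact assessment": ["C", "I", "A/R", "C", "C", "I"],
    "Bias evaluation": ["C", "A/R", "I", "I", "R", "C"],
    "Deployment gate approval": ["A", "C", "C", "C", "R", "C"],
    "Incident response lead": ["A", "R", "C", "C", "R", "I"],
    "Regulatory filings": ["I", "I", "A/R", "A/R", "I", "I"],
    "Vendor DPA negotiation": ["C", "I", "A/R", "A/R", "I", "I"],
    "Red team program": ["A", "R", "I", "I", "C", "I"],
    "Annual AI system review": ["A", "R", "C", "C", "R", "R"],
}


def roles_with(activity: str, letter: str) -> list[str]:
    """Return the roles holding the given RACI letter for an activity."""
    return [
        role
        for role, cell in zip(ROLES, RACI[activity])
        if letter in cell.split("/")
    ]


print(roles_with("Security assessment", "A"))
print(roles_with("Regulatory filings", "A"))
```

A lookup like this also makes shared accountability visible: the filings and DPA rows list both DPO and Legal as accountable, which a team may want to confirm is intentional.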


Common Mistakes

:::danger Mistake 1: No Central AI System Registry
Without a registry of all deployed AI systems - their risk tiers, owners, regulatory scope, and governance status - you cannot manage portfolio risk, respond to regulatory inquiries, or make informed prioritization decisions. Build the registry before writing any governance policies. It is the infrastructure everything else runs on. Minimum fields: system ID, name, owner team, deployment date, risk tier, regulatory scope, governance gate status, incident contact.
:::
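
As an illustration of how little is needed to start, the minimum fields listed above fit in a single record type. This is a sketch under assumptions: `RegistryEntry`, `ungoverned_high_risk`, the status values, and the example systems are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    """One row in a hypothetical AI system registry (minimum fields only)."""
    system_id: str
    name: str
    owner_team: str
    deployment_date: str          # ISO date string
    risk_tier: str                # assumed values: "critical", "high", "medium", "low"
    regulatory_scope: list[str]   # e.g. ["GDPR", "EU AI Act"]
    governance_gate_status: str   # assumed values: "approved", "pending", "not_started"
    incident_contact: str


def ungoverned_high_risk(registry: list[RegistryEntry]) -> list[RegistryEntry]:
    """Return high/critical systems that have not passed their governance gate."""
    return [
        entry for entry in registry
        if entry.risk_tier in ("critical", "high")
        and entry.governance_gate_status != "approved"
    ]


registry = [
    RegistryEntry("sys-001", "Hiring screener", "talent-eng", "2023-05-01",
                  "high", ["GDPR", "EU AI Act"], "not_started",
                  "talent-oncall@example.com"),
    RegistryEntry("sys-002", "Doc summarizer", "platform", "2024-11-12",
                  "low", [], "approved", "platform-oncall@example.com"),
]
for entry in ungoverned_high_risk(registry):
    print(entry.system_id, entry.name)
```

Even this small structure answers the first question an auditor is likely to ask: which high-risk systems are running without an approved review.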

:::danger Mistake 2: Governance as a Pre-Launch Checkbox
AI systems change after launch - models are updated, use cases expand, data drifts, regulations evolve. A system that was MEDIUM risk at launch may become HIGH risk if it is extended to new user populations or decision types. Governance must be continuous: mandatory re-assessment when use case changes, annual reviews for all high and critical systems, monitoring that surfaces when governance assumptions are violated.
:::

:::warning Mistake 3: One-Size-Fits-All Governance
Applying enterprise governance procedures to a low-risk internal document summarizer destroys developer productivity without reducing real risk. Tiered frameworks exist to prevent this. A Tier 4 (low-risk) system might need: a one-page system description, a basic security checklist, and an annual review. A Tier 1 (critical) system needs independent security assessment, red team engagement, third-party bias audit, board-level approval, and continuous monitoring. Match overhead to actual risk.
:::
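
The tier-to-controls mapping can be sketched as a simple lookup. Only the two tiers described above are encoded; intermediate tiers are left out because their control sets are not specified here. `REQUIRED_CONTROLS`, the tier keys, and `missing_controls` are hypothetical names.

```python
# Controls named in the text for the two extreme tiers. Intermediate tiers are
# deliberately omitted because the text does not define them.
REQUIRED_CONTROLS = {
    "tier_1_critical": [
        "independent security assessment",
        "red team engagement",
        "third-party bias audit",
        "board-level approval",
        "continuous monitoring",
    ],
    "tier_4_low": [
        "one-page system description",
        "basic security checklist",
        "annual review",
    ],
}


def missing_controls(tier: str, completed: set[str]) -> list[str]:
    """Return required controls for the tier that are not yet completed."""
    if tier not in REQUIRED_CONTROLS:
        raise KeyError(f"No control set defined for tier {tier!r}")
    return [c for c in REQUIRED_CONTROLS[tier] if c not in completed]


print(missing_controls("tier_4_low", {"one-page system description"}))
```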

:::warning Mistake 4: No Incident Response Plan Before Deployment
AI incidents are different from software bugs - they may require model rollback, training data investigation, regulatory notification within 72 hours, and mass user communications. By the time the incident occurs, it is too late to design the response. Define the incident response plan before deployment. Who makes the call to shut down the system? Who drafts the regulator notification? Where is the rollback procedure? Decisions made during a crisis take far longer, and go wrong more often, than decisions made in advance.
:::

:::tip Build Governance Into Developer Workflow
The most effective governance is governance developers do not have to think about: compliance checks in the CI/CD pipeline that fail on missing model cards, required fields in the model registry that block deployment, and automated compliance scans against the regulatory mapping. Low-friction governance gets adopted; high-friction governance gets ignored or worked around. Spend engineering effort automating governance controls, not writing policy documents that nobody reads.
:::
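
A pipeline check of this kind can be very small. The sketch below assumes the model card is stored as JSON and that a hypothetical minimum field list applies; `REQUIRED_FIELDS`, `missing_model_card_fields`, and `gate_exit_code` are illustrative names, not part of any CI product.

```python
import json

# Hypothetical minimum set of fields a pipeline might insist on.
REQUIRED_FIELDS = ["model_id", "version", "risk_tier", "incident_contact", "approved_by"]


def missing_model_card_fields(card: dict) -> list[str]:
    """Return required fields that are absent or empty in the model card."""
    return [name for name in REQUIRED_FIELDS if not card.get(name)]


def gate_exit_code(card_json: str) -> int:
    """Return 0 if the model card passes the check, 1 otherwise."""
    missing = missing_model_card_fields(json.loads(card_json))
    if missing:
        print(f"Model card check failed; missing fields: {', '.join(missing)}")
        return 1
    print("Model card check passed")
    return 0


example = json.dumps({
    "model_id": "fraud-detector",
    "version": "2.3.0",
    "risk_tier": "high",
    "incident_contact": "",   # empty values count as missing
    "approved_by": [],
})
exit_code = gate_exit_code(example)
```

In a real pipeline the returned code would be passed to `sys.exit` so that a non-zero result fails the build step.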

:::tip Automate Compliance Evidence Collection
The hardest part of compliance is rarely knowing what to do; it is collecting evidence that you did it. Build automated evidence collection: monitoring dashboards auto-generate performance reports, scan results are automatically archived with timestamps, deployment gate approvals are recorded in an immutable audit log. When the auditor arrives with a 47-page questionnaire, the answers should be queryable from a system, not reconstructed from email threads.
:::
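
One common way to approximate an immutable audit log in application code is a hash chain, where each record commits to the one before it. The sketch below is an assumption about how such a log might look, not a prescribed design; `append_record` and `verify_chain` are hypothetical names. A hash chain makes tampering detectable rather than impossible, so it is usually paired with write-once storage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _record_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous record's hash together with this event's content."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(log: list[dict], event: dict) -> dict:
    """Append an event to the log, chaining it to the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    record = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": _record_hash(prev_hash, event),
    }
    log.append(record)
    return record


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS_HASH
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _record_hash(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


audit_log: list[dict] = []
append_record(audit_log, {"gate": "pre-deployment", "system": "sys-001", "approver": "ciso"})
append_record(audit_log, {"gate": "bias-audit", "system": "sys-001", "approver": "ai-safety-lead"})
print(verify_chain(audit_log))
```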


Interview Questions and Answers

Q: What is the difference between AI safety and AI security?

A: They are related but address different threat models. AI security uses an adversarial threat model: a malicious actor is trying to attack the system. Defenses include adversarial robustness, prompt injection protection, data poisoning defenses, jailbreak resistance, and model extraction protection. AI safety uses a systems-failure threat model: the AI system causes harm even without a malicious actor - through misalignment, unexpected capability, bias, or distribution shift. Safety concerns include model misalignment, emergent behaviors, bias and fairness, and failure on out-of-distribution inputs. The two overlap significantly in practice: a successful jailbreak is both a security failure (attacker bypassed controls) and a safety failure (model produced harmful output). Modern AI governance frameworks address both: NIST AI RMF treats safety and security as characteristics of trustworthy AI; EU AI Act requirements cover safety, security, and fundamental rights. For engineers, the practical implication is that you need both: security controls against malicious actors AND safety evaluation on benign inputs that can still trigger harmful failures.


Q: What does the EU AI Act classify as high-risk AI and what are the requirements?

A: The EU AI Act defines high-risk AI in Annex III, which lists eight areas: biometric identification, critical infrastructure, education and vocational training, employment and worker management, essential private and public services (credit scoring, emergency dispatch), law enforcement, migration and border control, and administration of justice and democratic processes. High-risk systems must: implement a documented risk management system (Article 9), ensure training data governance for relevance, representativeness, and bias (Article 10), maintain technical documentation before market placement (Article 11), log usage for post-market monitoring (Article 12), be transparent to users and deployers (Article 13), enable effective human oversight (Article 14), and achieve appropriate accuracy, robustness, and cybersecurity (Article 15). The Act also introduces obligations for general-purpose AI (GPAI) models: all GPAI providers face transparency and documentation duties, and models trained with more than 10^25 FLOPs of compute are presumed to pose systemic risk and carry additional evaluation and risk-mitigation requirements. Obligations phase in from 2025 onward. The maximum penalty of €35M or 7% of global annual turnover applies to prohibited practices; breaches of most other obligations, including the high-risk requirements, are capped at €15M or 3%.
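
To make the Article 9-15 obligations above trackable per system, they can be held as a checklist keyed by article number. This is a minimal sketch: `HIGH_RISK_OBLIGATIONS`, `outstanding_obligations`, and the evidence file names are hypothetical, and the short labels paraphrase the requirements listed in the answer rather than quoting the regulation.

```python
# Short labels paraphrasing the high-risk requirements listed above.
HIGH_RISK_OBLIGATIONS = {
    9: "Risk management system",
    10: "Training data governance",
    11: "Technical documentation",
    12: "Usage logging",
    13: "Transparency to users and deployers",
    14: "Human oversight",
    15: "Accuracy, robustness, and cybersecurity",
}


def outstanding_obligations(evidence: dict[int, str]) -> list[str]:
    """List obligations that have no evidence artifact recorded yet."""
    return [
        f"Article {article}: {label}"
        for article, label in HIGH_RISK_OBLIGATIONS.items()
        if not evidence.get(article)
    ]


# Hypothetical evidence map for one system: article number -> artifact name.
evidence = {9: "risk-register.pdf", 11: "model-card-v3.md"}
for item in outstanding_obligations(evidence):
    print(item)
```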


Q: How would you build an AI incident response process from scratch?

A: Four components: (1) Incident typology - define AI-specific incident types: safety violations, jailbreaks, data poisoning, privacy violations (including training data memorization), bias harm, RAG injection, and performance regression. Each type has a different response track. (2) Detection mechanisms - user reports (support ticket escalation), automated monitoring (anomaly detection on output distributions, safety classifier on outputs, latency/error rate monitoring), red team findings, and external reports (security researchers, regulators). (3) Response playbooks by type - for each incident type, document the first 4 hours: triage steps, containment options (emergency system prompt patch, model rollback, traffic diversion, shutdown), investigation approach, notification requirements. For privacy violations, the GDPR 72-hour notification window starts when the organization becomes aware of the breach. (4) Post-incident process - every incident generates a root cause analysis, new test cases that would have caught the incident earlier, and a policy or control update that prevents recurrence. The quality of your post-incident process determines whether incidents repeat. The most common failure: organizations have a generic IT incident response process that does not account for AI-specific options like model rollback, knowledge base quarantine, or training data investigation.


Q: What should a model card contain and how does it serve governance purposes?

A: A model card (Mitchell et al., 2019) is a structured document serving multiple governance functions simultaneously. Content: intended use cases and prohibited uses, performance benchmarks disaggregated by demographic group and input type, known limitations and failure modes, identified biases and mitigation measures, training data description and known issues, evaluation data description, and governance metadata (risk tier, approval chain, review schedule, incident contact). Governance functions: (1) Deployment gate artifact - the model card is the primary document reviewed at the Pre-Deployment Security Review gate. Missing or incomplete model cards block deployment. (2) Regulatory evidence - Article 11 of the EU AI Act requires technical documentation for high-risk systems; the model card is the primary vehicle for this. (3) Customer and partner communication - enterprise customers deploying your model need to know its limitations and appropriate uses. A model card is the structured disclosure mechanism. (4) Living record - update the model card with each significant model version. The changelog captures what changed, why, and what new testing was done. This creates an auditable history of model evolution that is critical for regulatory compliance.


Q: How do you approach AI governance in a company that has never had it before?

A: Start with inventory, not policy. Before writing governance documents, know what you are governing: audit all deployed AI systems, classify each by risk tier (with a structured framework, roughly 30 minutes per system), identify regulatory scope (which systems process EU residents' data, which make high-impact decisions), and identify the most critical gaps (high-risk systems without privacy assessments, systems without incident response plans). This inventory creates the governance foundation and reveals the actual risk exposure - which is almost always different from what leadership expects. Then sequence: (1) Build the AI system registry as shared infrastructure. (2) Address the highest-risk gaps in existing systems first, rather than creating new governance overhead for low-risk tools. (3) Build the lifecycle governance process - approval gates for new systems stop ungoverned deployments from accumulating as technical debt. (4) Automate compliance evidence collection - build dashboards and automated reports before manual processes become unscalable. (5) Train engineering teams - governance that engineers understand gets followed; governance that arrives as paperwork gets ignored. The goal is governance that scales with AI portfolio growth: lightweight for low-risk tools, comprehensive for high-risk systems, and institutionally embedded so it happens automatically rather than as a special effort.


Q: How do you evaluate and manage vendor risk for third-party AI APIs?

A: Three-phase approach: (1) Pre-procurement assessment - before signing any contract, assess: SOC 2 Type 2 certification, data retention policy (how long does the vendor retain your inputs and outputs), opt-out of training data use (can your data be used to train future model versions), data residency (where does data physically reside), DPA availability (required for GDPR), audit rights. For high-risk use cases, request a third-party security assessment. (2) Contract negotiation - key terms: DPA with EU standard contractual clauses if applicable, explicit commitment not to use customer data for training, data retention limits (90 days maximum for most use cases), SLA with financial penalties for breach, notification requirements for security incidents, right to audit. (3) Ongoing monitoring - track vendor security bulletins and incident reports, review vendor's model update announcements (model updates can change behavior in production), conduct annual vendor re-assessment, maintain a fallback provider that can be switched to within hours for business-critical API dependencies. The vendor risk assessment should be proportionate to how much critical decision-making depends on the vendor and how much sensitive data flows through their API.


Q: What is the NIST AI Risk Management Framework and how does it differ from the EU AI Act?

A: NIST AI RMF (released January 2023) is a voluntary framework from the US National Institute of Standards and Technology that provides a structured approach to managing AI risk across any organization. It is organized around four functions: GOVERN (establishing organizational AI risk management practices), MAP (categorizing AI systems and identifying risks), MEASURE (analyzing and quantifying identified risks), and MANAGE (prioritizing and implementing risk responses). Unlike regulations, the NIST AI RMF is voluntary and outcome-focused - it describes what good AI risk management looks like without prescribing specific technical implementations. The EU AI Act (obligations phase in from 2025 onward) is legally binding for AI systems placed on or used in the EU market, regardless of where the provider or deployer is established. It is risk-tier-based: prohibited AI practices, high-risk AI with mandatory conformity assessment, GPAI with transparency obligations, and limited-risk AI with lighter requirements. The key difference: NIST AI RMF helps organizations build good AI governance practices voluntarily; EU AI Act mandates specific requirements with legal consequences for non-compliance. In practice, a company with a mature NIST AI RMF implementation will find EU AI Act compliance significantly easier, because both are risk-based and overlap heavily in what they ask for. Alignment with the NIST AI RMF is also increasingly requested in US government contracting and in enterprise AI procurement.


Summary

AI security governance is the organizational infrastructure that ensures AI systems are deployed responsibly - with appropriate risk assessment, security controls, regulatory compliance, documentation, and incident response capability. It is not a one-time activity but a continuous function that manages AI risk throughout the system lifecycle.

The five essential components: (1) A risk classification framework that matches governance overhead to actual risk - high-risk systems get comprehensive controls, low-risk tools get lightweight checklists. (2) A regulatory compliance program that maps EU AI Act, GDPR, NIST AI RMF, and sector-specific requirements to specific controls and evidence artifacts. (3) A lifecycle governance process with mandatory gates that prevent high-risk systems from reaching production without proper review. (4) Model cards as the primary governance artifact - living documents that serve deployment approvals, regulatory submissions, and customer communications simultaneously. (5) An incident response process adapted for AI-specific failure modes: model rollback, training data investigation, knowledge base quarantine, GDPR 72-hour notification windows.

The most common failure is treating governance as a pre-launch checkbox. The companies that navigate AI incidents, regulatory inquiries, and audit findings well are those that built continuous governance infrastructure - registries, monitoring, annual reviews, automated compliance evidence - before they needed it.

© 2026 EngineersOfAI. All rights reserved.