
AI Incident Response Plan: Building a Playbook for Model Failures and AI Gone Wrong

March 26, 2026 · Rebecca Leung

TL;DR: Your incident response plan probably covers ransomware and data breaches. It almost certainly doesn’t cover your AI hallucinating loan terms, your credit model drifting into discriminatory territory, or an AI agent fabricating transaction records. Here’s how to build an AI-specific incident response playbook — with severity tiers, containment controls, kill switches, and escalation paths that actually work when models go sideways.

Your IRP Has a Blind Spot

A Richmond Fed study (November 2025) found that banks increasing AI investments by 10% saw quarterly operational losses rise by 4% — driven primarily by external fraud, client/customer problems, and system failures. The GAO’s 2025 report on AI in financial services (GAO-25-107197) explicitly flagged operational and cybersecurity risks from AI, noting that regulators themselves are still catching up to the oversight challenge.

Meanwhile, AI-related incidents rose 21% from 2024 to 2025, and more than 90% of insurance decision-makers now consider AI-driven incidents a material concern, according to Aon.

Yet most incident response plans treat “incident” as synonymous with “cyber breach.” They have playbooks for phishing, ransomware, and unauthorized access. They have nothing for the moment your lending model starts rejecting applicants based on zip code as a proxy for race, or your customer-facing chatbot invents a refund policy that doesn’t exist.

That gap is about to get expensive. The EU AI Act’s Article 73 mandates serious incident reporting for high-risk AI systems, with obligations taking effect in August 2026. If you’re operating AI in lending, credit scoring, insurance underwriting, or fraud detection, you’re in scope — and “we didn’t have a playbook” won’t fly.

Here’s how to build one.

AI Incident Types: What Your Playbook Needs to Cover

Traditional IRPs don’t account for AI-specific failure modes. Your AI incident response playbook needs explicit procedures for each of these:

| Incident Type | What It Looks Like | Real-World Example |
| --- | --- | --- |
| Hallucination Event | Model generates false information presented as fact | Air Canada’s chatbot fabricated a bereavement fare policy — the airline was held liable (Feb 2024) |
| Bias Detection Trigger | Model produces discriminatory outcomes across protected classes | Lending model denial rates diverge by race, age, or geography beyond statistical thresholds |
| Model Drift | Gradual performance degradation from data or concept changes | Credit scoring model accuracy drops 15% over 6 months as economic conditions shift |
| Data Poisoning | Corrupted or manipulated training data produces unreliable outputs | Adversarial inputs to fraud detection model cause it to miss obvious fraud patterns |
| Unauthorized AI Use | Shadow AI deployed without governance review or approval | Business unit deploys a GPT-powered customer service tool without model risk review |
| AI Agent Failure | Autonomous AI system takes unintended actions | An AI agent tasked with managing expenses fabricated credible-sounding but false transaction details |

Each type has different detection mechanisms, different containment needs, and different regulatory implications. A hallucination in an internal summary tool is a Sev-3. A hallucination in a customer-facing lending disclosure is a Sev-1 with potential UDAP exposure.

Severity Classification for AI Incidents

Not every AI incident is a five-alarm fire. But you need pre-defined tiers so the on-call team isn’t debating severity while the model is actively making bad decisions.

| Severity | Criteria | Response Time | Who Gets Notified |
| --- | --- | --- | --- |
| Sev-1: Critical | Active customer harm, regulatory violation, systemic risk, or discriminatory outputs at scale | Immediate containment; executive notification within 15 min | CISO, CRO, General Counsel, CEO, Board Risk Committee |
| Sev-2: High | Confirmed bias in non-production validation, significant drift beyond tolerance, compliance risk identified | Containment within 1 hour; escalation within 4 hours | Model Risk Manager, Chief Compliance Officer, business line head |
| Sev-3: Medium | Isolated errors, minor drift within tolerance bands, single customer impact | Assessment within 24 hours | Model owner, AI governance lead, compliance liaison |
| Sev-4: Low | Cosmetic issues, edge-case anomalies, documentation gaps | Normal change management cycle | Model owner, logged for quarterly review |

Key principle: Severity is determined by impact scope and regulatory exposure, not by how embarrassing it is internally. A model that’s 2% less accurate in an internal forecasting tool is a Sev-4. That same 2% accuracy drop in a consumer lending model might be a Sev-2 if it’s pushing decisions across fair lending thresholds.
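To make that principle concrete, the "impact scope plus regulatory exposure" rule can be encoded so the on-call team isn't debating severity at 2 a.m. This is a hedged sketch, not a standard: the function name, inputs, and thresholds are illustrative assumptions, and a real classifier would pull these signals from your monitoring stack.

```python
from enum import IntEnum


class Severity(IntEnum):
    SEV1 = 1  # Critical: active customer harm or regulatory violation
    SEV2 = 2  # High: confirmed bias, significant drift, compliance risk
    SEV3 = 3  # Medium: isolated errors, single customer impact
    SEV4 = 4  # Low: cosmetic issues, edge-case anomalies


def classify(customer_facing: bool, regulated_decision: bool,
             accuracy_drop_pct: float, customers_affected: int) -> Severity:
    """Toy classifier: severity keys off impact scope and regulatory
    exposure, never internal embarrassment. Thresholds are illustrative."""
    if customer_facing and regulated_decision:
        return Severity.SEV1          # e.g. bad lending disclosures at scale
    if regulated_decision and accuracy_drop_pct >= 2.0:
        return Severity.SEV2          # 2% drop in a consumer lending model
    if customers_affected >= 1:
        return Severity.SEV3          # isolated, single-customer impact
    return Severity.SEV4              # internal-only, cosmetic


# The article's example: the same 2% accuracy drop lands in different
# tiers depending on where it happens.
internal_tool = classify(False, False, 2.0, 0)   # internal forecasting
lending_model = classify(False, True, 2.0, 0)    # consumer lending
```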

Auto-Escalation Triggers

Build these into your monitoring — if any trigger fires, severity automatically escalates regardless of initial classification:

  • Fair lending threshold breach: Any model output showing >2 standard deviation disparity across protected classes → auto-Sev-1
  • Customer-facing hallucination confirmed: Any verified false output delivered to a customer in a regulated context → auto-Sev-1
  • Regulatory inquiry received: Examiner or regulator asks about a specific model output → auto-Sev-2 minimum
  • Cumulative drift alarm: Model performance degrades beyond pre-set tolerance for 3+ consecutive monitoring cycles → auto-Sev-2
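The four triggers above can be wired in as a severity floor that overrides the responder's initial call. A minimal sketch, assuming the monitoring signals are already available as booleans and counters (the function and parameter names are hypothetical):

```python
def auto_escalate(initial_sev: int,
                  disparity_sd: float,
                  customer_hallucination_confirmed: bool,
                  regulatory_inquiry: bool,
                  drift_breach_cycles: int) -> int:
    """Apply the auto-escalation triggers: any trigger that fires sets a
    severity floor, regardless of the initial classification.
    Lower number = more severe (Sev-1 is critical)."""
    floor = 4                                  # default: least severe tier
    if disparity_sd > 2.0:                     # fair lending breach
        floor = min(floor, 1)                  # -> auto-Sev-1
    if customer_hallucination_confirmed:       # verified false output to customer
        floor = min(floor, 1)                  # -> auto-Sev-1
    if regulatory_inquiry:                     # examiner asked about an output
        floor = min(floor, 2)                  # -> auto-Sev-2 minimum
    if drift_breach_cycles >= 3:               # 3+ consecutive breached cycles
        floor = min(floor, 2)                  # -> auto-Sev-2 minimum
    return min(initial_sev, floor)


# A ticket opened as Sev-3 with a 2.5 SD disparity across protected
# classes escalates straight to Sev-1.
escalated = auto_escalate(3, 2.5, False, False, 0)
```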

Escalation Paths and Ownership

Vague escalation paths kill response speed. Name specific roles, not departments.

Incident Response Team for AI

| Role | Who Owns It | Responsibilities |
| --- | --- | --- |
| AI Incident Commander | Head of Model Risk Management (or CRO at smaller firms) | Declares severity, authorizes containment, coordinates response |
| Technical Lead | Senior ML Engineer / Data Scientist who owns the model | Root cause analysis, containment implementation, fix development |
| Compliance Lead | Chief Compliance Officer or designee | Regulatory notification assessment, fair lending analysis, documentation |
| Legal Lead | General Counsel or privacy attorney | Liability assessment, privilege considerations, regulatory communication |
| Business Lead | Business line head who sponsors the AI use case | Customer impact assessment, business continuity decisions, communication |
| Communications Lead | Corporate communications / PR | External messaging if incident becomes public |

For smaller firms without a dedicated Model Risk Management team: The AI Incident Commander role typically falls to whoever owns your model risk policy — often the Head of Compliance or VP of Engineering. The point isn’t the title; it’s that one person has decision authority during the incident.

Containment Strategies: Kill Switches, Fallbacks, and Human Takeover

This is where most AI incident plans fall apart. They say “contain the incident” without specifying how you contain a model that’s making thousands of decisions per minute.

The AI Kill Switch

As KPMG’s AI risk lead told Business Insider (March 2026): if an agent begins to drift from its intended role, there must be a “kill switch or a fallback option where you can turn them off.” Stanford Law’s CodeX research goes further — an effective kill switch provides immediate stop capability with state capture and immutable logging, plus rollback and quarantine controls to revert changes and isolate the agent.

What a kill switch actually looks like in practice:

  1. Feature flags at the API gateway. Every AI model in production sits behind a feature flag. Flipping it routes all requests to the fallback system. No code deployment required. Response time: seconds.

  2. Circuit breakers with automatic triggers. Set thresholds for error rates, latency spikes, or output anomalies. When breached, the circuit breaker trips automatically — no human decision needed for the initial containment.

  3. Fallback routing hierarchy. Define what takes over when the AI model goes down:

    • Tier 1 fallback: Previous validated model version (last known good)
    • Tier 2 fallback: Rule-based decisioning engine
    • Tier 3 fallback: Human review queue (all decisions routed to manual reviewers)
  4. State capture on shutdown. When the kill switch fires, capture: the model’s current state, the last N inputs/outputs, all pending decisions in the queue, and system logs. This is your investigation evidence — lose it and your root cause analysis is flying blind.
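Steps 1, 2, and 4 above can be sketched as a single in-process gateway. This is a toy illustration under stated assumptions: a real deployment would use a managed feature-flag service, a persistent log store, and per-route circuit breakers, and every class, threshold, and name here is hypothetical.

```python
import time
from collections import deque


class AIGateway:
    """Toy gateway combining a feature flag (step 1), an automatic
    circuit breaker (step 2), fallback routing (step 3), and state
    capture on shutdown (step 4). Thresholds are illustrative."""

    def __init__(self, model, fallback, error_threshold=0.2, window=50):
        self.model, self.fallback = model, fallback
        self.ai_enabled = True                # step 1: the feature flag
        self.error_threshold = error_threshold
        self.recent = deque(maxlen=window)    # rolling error window
        self.audit_log = []                   # last-N inputs/outputs
        self.snapshot = None

    def handle(self, request):
        if not self.ai_enabled:
            return self.fallback(request)     # routed to fallback tier
        try:
            result = self.model(request)
            self.recent.append(0)
        except Exception:
            self.recent.append(1)
            result = self.fallback(request)   # per-request failover
        self.audit_log.append({"t": time.time(), "in": request, "out": result})
        # step 2: circuit breaker trips automatically, no human decision
        if (len(self.recent) == self.recent.maxlen
                and sum(self.recent) / len(self.recent) > self.error_threshold):
            self.kill("circuit breaker: error-rate threshold breached")
        return result

    def kill(self, reason):
        """Step 4: flip the flag and capture state as investigation evidence."""
        self.ai_enabled = False
        self.snapshot = {"reason": reason,
                         "last_io": self.audit_log[-50:],
                         "captured_at": time.time()}


# Demo: a model that is hard-down trips the breaker after the window fills.
def always_fails(request):
    raise RuntimeError("model down")

gw = AIGateway(always_fails, fallback=lambda r: "manual-review",
               error_threshold=0.5, window=4)
results = [gw.handle(i) for i in range(5)]
```

Flipping the flag takes effect on the next request with no code deployment, which is the property the article is after: containment in seconds, with the snapshot preserved for root cause analysis.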

Containment by Incident Type

| Incident Type | Primary Containment | Fallback | Recovery Criteria |
| --- | --- | --- | --- |
| Hallucination (customer-facing) | Immediate kill switch; route to human agents | Disable AI channel entirely | Full output audit + revalidation before reactivation |
| Bias trigger | Kill switch on affected decision path | Rule-based model or manual review | Independent fair lending analysis confirms remediation |
| Model drift | Flag for enhanced monitoring; kill switch if beyond tolerance | Previous model version | Retraining + validation against current data |
| Data poisoning | Isolate data pipeline; halt model retraining | Last model version trained on clean data | Full data lineage audit + clean retraining |
| Shadow AI | Disable access; block API keys | N/A (system shouldn’t exist) | Governance review and formal onboarding or permanent shutdown |
| Agent failure | Revoke agent credentials at gateway | Manual process execution | Full behavioral audit + guardrail reinforcement |

Investigation and Root Cause Analysis

After containment, you need to figure out why the model failed. AI incidents require investigation procedures that traditional IRPs don’t cover.

AI-Specific Investigation Steps

  1. Preserve the evidence. Before anyone touches the model: snapshot the production environment, capture model weights/parameters, export the decision log for the incident window, and preserve the training data pipeline state. Put a litigation hold on all AI-related logs if the incident has regulatory or legal exposure.

  2. Timeline reconstruction. When did the model’s behavior first deviate? Map the incident timeline against: recent model updates, data pipeline changes, feature engineering modifications, and infrastructure changes. Most AI incidents trace back to something that changed upstream.

  3. Output analysis. For the affected time window: How many decisions were impacted? What was the distribution of outcomes? Were specific customer segments disproportionately affected? This is the data your regulator will ask for.

  4. Data lineage review. Trace the model’s training data and real-time input data back to source. Look for: data quality degradation, schema changes in upstream feeds, population shifts in input data, and unauthorized data source additions.

  5. Model behavior forensics. Compare the model’s behavior during the incident against its validated baseline. Where exactly did outputs diverge? Which features drove the anomalous decisions? Use explainability tools (SHAP, LIME) to understand what the model was actually doing.
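The output analysis in step 3 is the most mechanical of these and can be sketched with nothing but the decision log. A minimal illustration, assuming each logged decision carries a customer segment and an outcome (the record shape and field names are hypothetical; real incident logs will differ):

```python
from collections import defaultdict


def output_analysis(decisions):
    """Step-3 sketch: quantify impact for the affected window.
    `decisions` is a list of dicts like {"segment": str, "approved": bool}."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for d in decisions:
        totals[d["segment"]] += 1
        approvals[d["segment"]] += int(d["approved"])
    rates = {seg: approvals[seg] / totals[seg] for seg in totals}
    return {
        "decisions_impacted": len(decisions),
        "approval_rate_by_segment": rates,
        # gap between best- and worst-treated segments in the window
        "max_disparity": max(rates.values()) - min(rates.values()),
    }


# Illustrative window: segment A approves at 80%, segment B at 40%,
# the kind of disproportionate impact a regulator will ask about.
sample = ([{"segment": "A", "approved": True}] * 8
          + [{"segment": "A", "approved": False}] * 2
          + [{"segment": "B", "approved": True}] * 4
          + [{"segment": "B", "approved": False}] * 6)
report = output_analysis(sample)
```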

Post-Incident Review: Making It Stick

The post-incident review for AI failures needs to go beyond the standard “lessons learned” document that nobody reads.

What the Post-Incident Report Must Cover

  • Impact quantification: How many customers affected, dollar value of incorrect decisions, regulatory exposure created
  • Detection gap analysis: How long was the model producing bad outputs before detection? Why didn’t monitoring catch it sooner?
  • Control failure analysis: Which controls were supposed to prevent this? Why did they fail? Were they tested?
  • Monitoring enhancement plan: Specific new alerts, thresholds, or dashboards to detect this failure mode in the future
  • Model governance updates: Changes to validation procedures, approval gates, or deployment controls
  • Regulatory notification log: Document the decision to notify or not notify each applicable regulator, with supporting rationale

Regulatory Expectations

Banking regulators have been clear that existing model risk management guidance applies to AI. SR 11-7 (the Fed’s model risk management guidance, jointly issued with the OCC as Bulletin 2011-12) explicitly requires ongoing monitoring, outcome analysis, and a process for addressing model failures — all of which map directly to AI incident response.

The NIST AI Risk Management Framework (AI RMF 1.0) addresses incident response through its Govern and Manage functions, and the Generative AI Profile (NIST AI 600-1), released July 2024, extends this to GenAI-specific risks including hallucinations and content integrity failures.

For firms with EU exposure, Article 73 of the EU AI Act creates mandatory incident reporting for high-risk AI systems. Providers must report serious incidents to national market surveillance authorities — and the European Commission published a reporting template to operationalize this obligation. The high-risk obligations take effect August 2026.

The OECD AI Incidents Monitor is building a global evidence base for AI incidents — and in February 2025, the OECD published a paper on developing a common AI incident reporting framework. The direction of travel is clear: AI incident reporting will become as standardized as cyber breach reporting. Build the infrastructure now.

So What?

If you’re deploying AI in financial services — and at this point, most firms are — you need an AI incident response plan that’s as detailed and tested as your cyber IRP. Not a paragraph in your existing plan that says “AI incidents will be handled through existing processes.” A dedicated playbook with:

  • Defined incident types specific to AI failure modes
  • Pre-set severity tiers with auto-escalation triggers
  • Named owners with clear decision authority
  • Kill switches and fallback systems that are tested quarterly
  • Investigation procedures designed for model forensics
  • Post-incident reviews that drive actual control improvements

The firms that build this now — before their first serious AI incident — will spend hours on a controlled response instead of weeks on a chaotic one. The firms that wait will learn the hard way that “we’ll figure it out when it happens” is not a plan.

If you need a head start, the Incident Response & Breach Notification Kit includes customizable playbook templates, severity classification matrices, and escalation workflows you can extend for AI-specific scenarios.

Frequently Asked Questions

What AI-specific incidents should an incident response plan cover?
An AI incident response plan should cover at minimum: hallucination events (model generating false outputs presented as fact), bias detection triggers (discriminatory outcomes in lending, underwriting, or hiring), model drift (gradual performance degradation from data or concept changes), data poisoning (corrupted training data producing unreliable outputs), unauthorized AI use (shadow AI deployed without governance approval), and AI agent failures (autonomous systems taking unintended actions). Each incident type needs its own detection criteria, severity classification, and containment procedure.
How do you classify the severity of an AI incident?
AI incident severity is typically classified in four tiers based on impact scope and regulatory exposure. Critical (Sev-1) involves active customer harm, regulatory violations, or systemic risk — requiring immediate containment and executive notification within 15 minutes. High (Sev-2) covers confirmed bias, significant drift, or compliance risk — with 1-hour escalation. Medium (Sev-3) includes isolated errors or minor drift within tolerance — handled within 24 hours. Low (Sev-4) covers cosmetic issues or edge-case anomalies tracked through normal change management.
What is an AI kill switch and when should you use one?
An AI kill switch is a pre-built mechanism to immediately disable a model in production — routing decisions to a fallback system, rule-based engine, or human reviewers. You use it when an AI system is actively causing harm: producing discriminatory outputs at scale, hallucinating in customer-facing interactions, or making autonomous decisions outside its approved boundaries. The EU AI Act requires human override capability for high-risk AI systems. Implementation includes feature flags, circuit breakers at the API gateway level, and pre-configured fallback routing — all tested before you need them.
Rebecca Leung

Rebecca Leung has 8+ years of risk and compliance experience across first and second line roles at commercial banks, asset managers, and fintechs. Former management consultant advising financial institutions on risk strategy. Founder of RiskTemplates.
