AI Risk

AI Governance Program Checklist: What Regulators Actually Test

TL;DR

  • The May 2025 GAO report (GAO-25-107197) found regulators apply SR 11-7 to AI inconsistently — but they’re all checking the same six core areas
  • Exam findings cluster in predictable places: incomplete model inventories, no pre-deployment validation records, third-party AI with no due diligence, and no consumer harm monitoring
  • OCC Bulletin 2025-26 (October 2025) confirmed a risk-based approach to MRM frequency — but the substance of what you need hasn’t changed
  • This checklist maps what examiners actually test to specific program components you can build and document

Most AI governance programs were built in a hurry. The business wanted to deploy an LLM. Someone in compliance drafted a policy. Someone in IT populated the first few rows of a model inventory. The board got a slide about AI.

That’s not what examiners test. And when they find the gaps — an examiner asking to see your model inventory and discovering it lists only three systems when your company uses forty-seven — you have a bigger problem than you started with.

The May 2025 GAO report on AI in financial services found that federal regulators apply existing model risk management guidance to AI “inconsistently across agencies.” What that means in practice: OCC examiners, Fed examiners, and FDIC examiners may approach AI governance reviews differently — but they’re all evaluating the same underlying program components. If you have those components documented and demonstrable, you’re in good shape regardless of which agency walks through the door.

Here’s the complete checklist.


Domain 1: Board and Senior Management Oversight

Examiners evaluate whether AI risk is governed at the appropriate organizational level — not buried below the compliance function, and not delegated entirely to IT.

What examiners look for:

  • AI risk is explicitly included in your board-level risk appetite statement or an appendix to it
  • Senior management receives periodic reporting on AI risk — at minimum annually, and quarterly for institutions with high-risk AI use
  • A named executive owns AI risk program accountability (typically CRO, CCO, or a dedicated AI risk officer)
  • Your risk committee charter or governance documents reference AI/model risk oversight
  • Major AI deployments or policy changes are approved by management, not unilaterally deployed by business units

Common finding: AI is treated as a technology issue managed exclusively by IT. There is no compliance ownership, no board visibility into AI use, and no defined escalation path when an AI system behaves unexpectedly.

What it maps to: SR 11-7 Section on Governance, Policies, and Controls; OCC Bulletin 2011-12 governance pillar; the GAO’s finding that regulatory oversight gaps often stem from institutions failing to elevate AI risk to appropriate governance levels.


Domain 2: AI Model Inventory

Every AI system in production needs to be documented before an examiner asks for it. “We’re working on the inventory” is not an exam answer.

What examiners look for:

  • A complete inventory of all AI systems in production, including vendor-supplied AI tools
  • Each inventory entry includes: model name, use case, risk tier, business owner, validation status, and deployment date (a minimal record sketch follows this list)
  • The inventory distinguishes between in-house models and third-party/embedded AI
  • New models are added to the inventory before go-live (not retroactively)
  • The inventory is reviewed and updated at least annually, with a documented review date
  • Shadow AI — AI tools used by employees without formal approval — is identified through periodic surveys or technology monitoring
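
None of these fields require specialized tooling. As a minimal sketch, one inventory entry could be represented as a plain record; the field names below mirror the bullets above and are illustrative, not a regulatory schema, and most teams will hold this in a GRC tool or spreadsheet rather than code:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class RiskTier(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"

@dataclass
class InventoryEntry:
    # Field names mirror the checklist above; illustrative only.
    model_name: str
    use_case: str
    risk_tier: RiskTier
    business_owner: str
    validation_status: str        # e.g. "validated", "pending", "overdue"
    deployment_date: date
    third_party: bool = False     # vendor-supplied or embedded AI
    vendor: str | None = None
    last_reviewed: date | None = None  # evidence for the annual review

# A vendor-embedded model belongs in the inventory too
crm_scorer = InventoryEntry(
    model_name="CRM lead-scoring engine",
    use_case="Sales lead prioritization",
    risk_tier=RiskTier.LOW,
    business_owner="VP, Sales Operations",
    validation_status="pending",
    deployment_date=date(2024, 3, 1),
    third_party=True,
    vendor="Example CRM Inc.",
)
```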

The vendor AI gap: The most common inventory failure isn’t missing homegrown models. It’s vendor-supplied AI. If your CRM uses an AI engine to score leads, if your AML system uses ML to flag transactions, if your customer service platform uses an LLM for response drafting — all of those go in the inventory. Examiners know that modern software is full of AI components, and an inventory that lists only custom-built models signals a CYA exercise, not a real program.

What it maps to: SR 11-7 governance requirements; OCC 2011-12 model inventory expectations; the GAO’s observation that institutions’ AI inventories frequently undercount vendor-embedded AI.


Domain 3: Pre-Deployment Review and Approval

Getting an AI system from “idea” to “production” requires a documented approval process. Examiners want to see the paper trail.

What examiners look for:

  • A defined new-model approval process that requires pre-deployment review for all AI systems
  • Risk tiering of proposed AI systems before deployment (High/Medium/Low)
  • Pre-deployment validation documentation — at minimum, testing results and known limitations
  • Compliance sign-off for AI systems used in consumer-facing decisions, credit, or compliance functions
  • A defined set of required artifacts before a model goes live (the equivalent of a pre-deployment checklist; a minimal gate is sketched after this list)
  • Change management procedures for system prompt changes, model version updates, and use case expansions
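
A minimal sketch of that artifact gate, assuming the required items are tracked as simple flags (the artifact names are illustrative; a real process would pull status from a GRC or workflow system):

```python
# Minimal pre-deployment gate: the model goes live only when every
# required artifact exists. Artifact names are illustrative.
REQUIRED_ARTIFACTS = {
    "risk_tier_assigned",
    "validation_testing_results",
    "known_limitations_documented",
    "business_owner_signoff",
    "compliance_signoff",  # needed for consumer-facing, credit, or compliance uses
}

def ready_for_deployment(completed: set[str]) -> tuple[bool, set[str]]:
    """Return (approved, missing_artifacts) for a proposed AI system."""
    missing = REQUIRED_ARTIFACTS - completed
    return (not missing, missing)

approved, missing = ready_for_deployment({
    "risk_tier_assigned",
    "validation_testing_results",
})
if not approved:
    print(f"Blocked: missing artifacts {sorted(missing)}")
```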

The “we just deployed it” problem: AI systems get deployed through vendor contract renewals, software updates, and developer experimentation — often without anyone in compliance knowing it happened. Your pre-deployment review process must extend to these channels, not just new formal initiatives. If a vendor updates their model and the change wasn’t reviewed by compliance, that’s a governance gap regardless of whether the update came from an outside party.

What it maps to: SR 11-7 model development and implementation pillar; OCC model risk management; NIST AI RMF MAP and MEASURE functions (pre-deployment evaluation).


Domain 4: Ongoing Monitoring

Deploying a model is not the end of the compliance story. Ongoing monitoring is the part most programs build last — and the part that produces the most post-deployment exam findings.

What examiners look for:

  • Defined monitoring metrics and thresholds for each production AI system
  • Monitoring frequency appropriate to risk tier (High: monthly or real-time; Medium: quarterly; Low: annual review)
  • A process for detecting model drift — degradation in output quality or accuracy over time (one common drift metric is sketched after this list)
  • Defined revalidation triggers (base model updates, use case changes, output quality breaches)
  • Consumer complaints analyzed for patterns that might indicate AI-related harm
  • Escalation procedures when monitoring metrics exceed thresholds
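
One common drift metric in banking model risk practice is the population stability index (PSI), which compares a model’s current output distribution against its validation-time baseline. The sketch below is a minimal version; the thresholds are conventional rules of thumb, not regulatory values:

```python
import math

def psi(expected: list[float], actual: list[float]) -> float:
    """Population Stability Index over pre-defined score bins.

    Each list holds a bin's share of the population (sums to 1.0).
    A small floor avoids log(0) on empty bins.
    """
    floor = 1e-4
    return sum(
        (a - e) * math.log(max(a, floor) / max(e, floor))
        for e, a in zip(expected, actual)
    )

# Validation-time baseline vs. current-month score distribution
baseline = [0.10, 0.25, 0.30, 0.25, 0.10]
current = [0.05, 0.15, 0.30, 0.30, 0.20]

value = psi(baseline, current)
# Conventional rules of thumb: <0.10 stable, 0.10-0.25 watch, >0.25 escalate
if value > 0.25:
    print(f"PSI={value:.3f}: threshold breach -- trigger revalidation")
elif value > 0.10:
    print(f"PSI={value:.3f}: moderate shift -- increase monitoring frequency")
```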

The set-it-and-forget-it problem: Most AI governance failures that become exam findings aren’t at deployment — they’re in the months after deployment, when no one is watching. A credit scoring model that drifts toward disparate outcomes. A customer service LLM that starts giving inaccurate product information. A fraud detection model whose false positive rate doubles after a vendor update. All of these are monitoring failures.

What it maps to: SR 11-7 ongoing monitoring requirements; OCC’s expectation for model performance tracking; NIST AI RMF MANAGE function.


Domain 5: Third-Party AI Vendor Oversight

OCC Bulletin 2013-29 (third-party risk management, since superseded by the 2023 interagency third-party guidance) applies to AI vendors. Your TPRM program needs AI-specific due diligence.

What examiners look for:

  • AI vendors are included in your vendor risk tiering methodology
  • Pre-onboarding due diligence for AI vendors includes AI-specific questions covering training data, bias controls, explainability, drift monitoring, and incident notification (a starter set is sketched after this list)
  • Contracts with AI vendors include data handling protections, audit rights, and incident notification requirements
  • AI vendor performance is monitored post-deployment
  • Concentration risk — overreliance on a single AI vendor for critical functions — is assessed and documented
  • Vendor model updates are reviewed before deployment (you know when your vendor’s AI changes)
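
As a starting point, the AI-specific questions can live in a simple structure that your TPRM workflow iterates over. The questions below paraphrase the bullets above and are a seed list, not a complete questionnaire:

```python
# AI-specific due diligence questions keyed by topic -- a seed list for a
# TPRM questionnaire, not an exhaustive one.
AI_VENDOR_QUESTIONS: dict[str, list[str]] = {
    "training_data": [
        "What data was the model trained on, and is our customer data used for training?",
    ],
    "bias_controls": [
        "What bias testing do you perform, how often, and can we see summary results?",
    ],
    "explainability": [
        "Can the model produce reason codes for individual decisions?",
    ],
    "drift_monitoring": [
        "How do you monitor output quality and drift in production?",
    ],
    "change_and_incident_notification": [
        "How far in advance are model updates communicated, and what is your incident notification SLA?",
    ],
}

def open_items(responses: dict[str, str]) -> list[str]:
    """Return every question the vendor has not yet answered."""
    return [
        q
        for questions in AI_VENDOR_QUESTIONS.values()
        for q in questions
        if q not in responses
    ]

# No responses yet: everything is open
print(len(open_items({})))  # 5
```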

The concentration risk problem: Many institutions rely on one or two foundational AI providers — often the same large tech companies — for a broad range of AI functions. The GAO’s 2025 report flagged this as an emerging risk that regulators are starting to ask about. If your fraud system, your credit scoring, your customer service AI, and your compliance monitoring all run on the same provider’s infrastructure, you have systemic concentration risk that belongs in your risk appetite discussion.

What it maps to: OCC Bulletin 2013-29; FFIEC IT Handbook third-party risk guidance; GAO-25-107197 findings on AI vendor oversight gaps.


Domain 6: Consumer Protection and Fair Lending

The AI-specific exam questions that are most likely to produce real enforcement risk: does your AI create disparate impact, and are you monitoring for it?

What examiners look for:

  • AI systems used in credit decisioning, pricing, or underwriting have been tested for disparate impact before deployment
  • Disparate impact testing is periodic — not one-time — for models used in lending decisions (a common screening calculation is sketched after this list)
  • Adverse action notices comply with FCRA/ECOA requirements for AI-assisted decisions (consumers have the right to know what factors influenced a decision)
  • Explainability documentation exists for AI-driven credit decisions
  • Marketing AI — models that target or segment customers — has been reviewed for fair lending implications
  • AI-related consumer complaints are tracked and analyzed separately from general complaints
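
One common first-pass screen is the adverse impact ratio, borrowed from the EEOC’s four-fifths rule in employment testing. It is a heuristic for flagging disparities that warrant deeper statistical analysis, not a legal threshold in lending. A minimal sketch with hypothetical numbers:

```python
def adverse_impact_ratio(approved_protected: int, total_protected: int,
                         approved_control: int, total_control: int) -> float:
    """Ratio of the protected group's approval rate to the control group's."""
    protected_rate = approved_protected / total_protected
    control_rate = approved_control / total_control
    return protected_rate / control_rate

# Hypothetical: 312 of 800 protected-class applicants approved,
# 540 of 1,000 control-group applicants approved
air = adverse_impact_ratio(312, 800, 540, 1000)

# The four-fifths screen (AIR < 0.80) flags a disparity for deeper
# statistical review; it is a heuristic, not a legal threshold.
if air < 0.80:
    print(f"AIR={air:.2f}: flag for statistical testing and documented review")
```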

The adverse action notice problem: An AI system that contributes to a credit denial must still generate a compliant adverse action notice under ECOA and FCRA. Most traditional adverse action frameworks list specific factors (“too many inquiries,” “insufficient credit history”). When an LLM or a black-box ML model contributes to a decision, generating those specific factors becomes technically difficult — and regulators know it. If your AI-assisted credit decisions can’t produce ECOA-compliant adverse action notices, you have a regulatory exposure that will surface in an exam.

What it maps to: ECOA/Regulation B adverse action requirements; CFPB UDAAP authority; OCC fair lending examination procedures; GAO finding that AI can amplify bias risks in lending.


The Examiner’s Quick-Check List

When an examiner walks into an AI governance review, they’re likely to ask for five things immediately. If you can produce these, you’ve answered 80% of what follows:

What They’ll Ask For | What “Good” Looks Like
Model inventory | Complete list of all production AI, tiered, with owners and validation status
Most recent model validation | Documented testing results, not just “passed review”
New model approval example | A recent deployment with a paper trail from request through approval
Third-party AI vendor list | AI vendors identified and their AI-specific due diligence documented
AI-related complaint data | Complaints pulled, categorized for AI-related patterns, with trend analysis

Building Your Program: The 90-Day Exam-Ready Baseline

If you’re starting from a weak position, here’s a realistic 90-day sequence:

Days 1–30: Inventory and Triage

  • Complete AI model inventory including all vendor-supplied AI
  • Risk-tier every model (use High/Medium/Low based on decision impact, customer exposure, and regulatory applicability; a simple rubric is sketched after this list)
  • Identify the top 3–5 High-tier models — these get attention first
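
A simple rubric built on those three factors might look like the sketch below; the scoring logic is illustrative and should be calibrated to your own risk appetite:

```python
def risk_tier(decision_impact: bool, customer_facing: bool,
              regulated_use: bool) -> str:
    """Illustrative High/Medium/Low tiering on the three factors above.

    decision_impact: does the model drive or materially influence decisions?
    customer_facing: are customers directly exposed to its outputs?
    regulated_use:   does it touch credit, pricing, AML, or similar functions?
    """
    if regulated_use:
        return "High"  # regulatory applicability dominates the other factors
    score = sum([decision_impact, customer_facing])
    if score == 2:
        return "High"
    if score == 1:
        return "Medium"
    return "Low"

# A credit decisioning model is High-tier regardless of other factors
print(risk_tier(decision_impact=True, customer_facing=False, regulated_use=True))
```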

Days 31–60: Documentation and Governance

  • Draft or update AI governance policy naming owners and approval authorities
  • Build or update pre-deployment checklist
  • Review third-party AI vendors against TPRM standards; send AI-specific questionnaires to High-tier vendors
  • Define monitoring metrics for High-tier models

Days 61–90: Controls and Evidence

  • Validate or review documentation for top High-tier models
  • Run disparate impact testing for any AI involved in credit or pricing
  • Pull 90-day complaint data and analyze it for AI-related patterns (a first-pass tagger is sketched after this list)
  • Present AI program summary to management or board
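
For the complaint analysis, a crude keyword tagger is enough to produce a first trend view. The signal terms below are illustrative and should be tuned to your own products and channels:

```python
from collections import Counter

# Illustrative signal terms; tune to your products and channels.
AI_SIGNALS = ("chatbot", "automated decision", "algorithm",
              "virtual assistant", "denied automatically", "bot")

def is_ai_related(complaint_text: str) -> bool:
    text = complaint_text.lower()
    return any(signal in text for signal in AI_SIGNALS)

def monthly_ai_trend(complaints: list[tuple[str, str]]) -> Counter:
    """complaints are (YYYY-MM, free-text) pairs; returns AI-flagged counts by month."""
    return Counter(month for month, text in complaints if is_ai_related(text))

sample = [
    ("2025-06", "The chatbot gave me the wrong payoff amount."),
    ("2025-06", "Branch wait times are too long."),
    ("2025-07", "My application was denied automatically with no explanation."),
]
print(monthly_ai_trend(sample))  # Counter({'2025-06': 1, '2025-07': 1})
```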

This is a baseline, not a complete program. But it produces the artifacts examiners ask for first — and it demonstrates program credibility that shapes everything else in the review.

For teams building this with templates, the AI Risk Assessment Template includes an AI model inventory, pre-deployment checklist, third-party AI questionnaire, and monitoring documentation framework aligned to SR 11-7 and OCC guidance.


Frequently Asked Questions

What do regulators look for when examining an AI governance program?
Regulators examine six core areas: (1) board and management oversight — is AI risk identified and governed at the board level? (2) AI model inventory — are all AI systems catalogued with risk tiers and owners? (3) pre-deployment review — was the model validated and approved before going live? (4) third-party AI oversight — does your TPRM program cover AI vendors specifically? (5) ongoing monitoring — are models being monitored post-deployment for drift and degradation? (6) fair lending and consumer protection — does your AI create disparate impact risk? Gaps in any of these areas produce exam findings.
What did the GAO find about how regulators examine AI in financial services?
The May 2025 GAO report (GAO-25-107197) found that financial regulators apply existing guidance like SR 11-7 to AI, but do so inconsistently across agencies. The GAO found that examiners vary in how they assess AI governance — some focus on model risk management documentation, others on fair lending implications, others on cybersecurity. The NCUA was specifically cited for lacking adequate AI model risk guidance. The report also noted that AI can amplify bias risks in credit and lending decisions.
Is there a single regulatory standard for AI governance in US banking?
No single standard exists — yet. The Federal Reserve and OCC apply SR 11-7 and OCC Bulletin 2011-12 to AI/ML models. The FFIEC guidance covers technology-related risk broadly. The OCC’s Comptroller’s Handbook covers unfair or deceptive acts or practices as applied to AI-driven decisions. The NIST AI RMF is voluntary but increasingly cited in exam feedback. Institutions are expected to apply existing frameworks with AI-specific adaptations — and the regulatory landscape is evolving quickly.
What are the most common AI governance exam findings?
Based on regulatory guidance and examiner feedback patterns: (1) no AI model inventory or incomplete inventory that excludes vendor-supplied AI; (2) missing or incomplete pre-deployment validation documentation; (3) no defined risk tiering for AI models; (4) absence of ongoing monitoring metrics and thresholds; (5) third-party AI vendors not subject to the same due diligence as other vendors; (6) no process for identifying AI-related consumer complaints or fair lending risks.
Does the OCC's October 2025 MRM bulletin change what I need to have in place?
OCC Bulletin 2025-26 clarified that community banks don't need annual model validation if their risk-based assessment supports a different frequency. It signaled a broader MRM guidance review for banks of all sizes. What it doesn't change: examiners still expect a model inventory, documented governance, and ongoing monitoring. The bulletin reduces prescriptive frequency requirements, not the substance of what you need to demonstrate.
What is the minimum viable AI governance program for a small fintech or community bank?
Minimum viable: (1) an AI model inventory with all production AI systems identified, tiered, and owned; (2) a pre-deployment checklist — run before any AI system goes live; (3) a third-party AI questionnaire for vendors supplying AI tools; (4) basic ongoing monitoring — at minimum, tracking complaint patterns and periodic output quality review; (5) a documented AI policy that names who approves AI deployments. This isn't aspirational — these are the items examiners ask for.
Rebecca Leung

Rebecca Leung has 8+ years of risk and compliance experience across first and second line roles at commercial banks, asset managers, and fintechs. Former management consultant advising financial institutions on risk strategy. Founder of RiskTemplates.

