AI Governance Program Checklist: What Regulators Actually Test
TL;DR
- The May 2025 GAO report (GAO-25-107197) found regulators apply SR 11-7 to AI inconsistently — but they’re all checking the same six core areas
- Exam findings cluster in predictable places: incomplete model inventories, no pre-deployment validation records, third-party AI with no due diligence, and no consumer harm monitoring
- OCC Bulletin 2025-26 (October 2025) confirmed a risk-based approach to MRM frequency — but the substance of what you need hasn’t changed
- This checklist maps what examiners actually test to specific program components you can build and document
Most AI governance programs were built in a hurry. The business wanted to deploy an LLM. Someone in compliance drafted a policy. Someone in IT populated the first few rows of a model inventory. The board got a slide about AI.
That’s not what examiners test. And when they find the gaps — an examiner asking to see your model inventory and discovering it lists only three systems when your company uses forty-seven — you have a bigger problem than you started with.
The May 2025 GAO report on AI in financial services found that federal regulators apply existing model risk management guidance to AI “inconsistently across agencies.” What that means in practice: OCC examiners, Fed examiners, and FDIC examiners may approach AI governance reviews differently — but they’re all evaluating the same underlying program components. If you have those components documented and demonstrable, you’re in good shape regardless of which agency walks through the door.
Here’s the complete checklist.
Domain 1: Board and Senior Management Oversight
Examiners evaluate whether AI risk is governed at the appropriate organizational level — not managed below compliance, not delegated entirely to IT.
What examiners look for:
- AI risk is explicitly included in your board-level risk appetite statement or an appendix to it
- Senior management receives periodic reporting on AI risk — at minimum annually; quarterly for institutions with high-risk AI in production
- A named executive owns AI risk program accountability (typically CRO, CCO, or a dedicated AI risk officer)
- Your risk committee charter or governance documents reference AI/model risk oversight
- Major AI deployments or policy changes are approved by management, not unilaterally deployed by business units
Common finding: AI is treated as a technology issue managed exclusively by IT. There is no compliance ownership, no board visibility into AI use, and no defined escalation path when an AI system behaves unexpectedly.
What it maps to: SR 11-7 Section on Governance, Policies, and Controls; OCC Bulletin 2011-12 governance pillar; the GAO’s finding that regulatory oversight gaps often stem from institutions failing to elevate AI risk to appropriate governance levels.
Domain 2: AI Model Inventory
Every AI system in production needs to be documented before an examiner asks for it. “We’re working on the inventory” is not an exam answer.
What examiners look for:
- A complete inventory of all AI systems in production, including vendor-supplied AI tools
- Each inventory entry includes: model name, use case, risk tier, business owner, validation status, and deployment date
- The inventory distinguishes between in-house models and third-party/embedded AI
- New models are added to the inventory before go-live (not retroactively)
- The inventory is reviewed and updated at least annually, with a documented review date
- Shadow AI — AI tools used by employees without formal approval — is identified through periodic surveys or technology monitoring
The vendor AI gap: The most common inventory failure isn’t missing homegrown models. It’s vendor-supplied AI. If your CRM uses an AI engine to score leads, if your AML system uses ML to flag transactions, if your customer service platform uses an LLM for response drafting — all of those go in the inventory. Examiners know that modern software is full of AI components, and an inventory that lists only custom-built models signals a CYA exercise, not a real program.
What it maps to: SR 11-7 governance requirements; OCC 2011-12 model inventory expectations; the GAO’s observation that institutions’ AI inventories frequently undercount vendor-embedded AI.
Domain 3: Pre-Deployment Review and Approval
Getting an AI system from “idea” to “production” requires a documented approval process. Examiners want to see the paper trail.
What examiners look for:
- A defined new-model approval process that requires pre-deployment review for all AI systems
- Risk tiering of proposed AI systems before deployment (High/Medium/Low)
- Pre-deployment validation documentation — at minimum, testing results and known limitations
- Compliance sign-off for AI systems used in consumer-facing decisions, credit, or compliance functions
- A defined set of required artifacts before a model goes live (the equivalent of a pre-deployment checklist)
- Change management procedures for system prompt changes, model version updates, and use case expansions
The “we just deployed it” problem: AI systems get deployed through vendor contract renewals, software updates, and developer experimentation — often without anyone in compliance knowing it happened. Your pre-deployment review process must extend to these channels, not just new formal initiatives. If a vendor updates their model and the change wasn’t reviewed by compliance, that’s a governance gap regardless of whether the update came from an outside party.
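Risk tiering can start as a simple decision rule. This is a hedged sketch using the three factors the 90-day plan later in this article names (decision impact, customer exposure, regulatory applicability); the logic and cut-offs are assumptions for illustration, not regulatory guidance.

```python
# Illustrative pre-deployment risk tiering. The three inputs mirror
# the factors named in this article's 90-day plan; the mapping to
# tiers is an assumption, not supervisory guidance.
def risk_tier(decision_impact: bool,
              customer_facing: bool,
              regulated_function: bool) -> str:
    """decision_impact: does the model drive or materially influence a decision?
    customer_facing: do outputs reach consumers?
    regulated_function: credit, AML, fair lending, or similar."""
    if decision_impact and (customer_facing or regulated_function):
        return "High"
    if decision_impact or customer_facing or regulated_function:
        return "Medium"
    return "Low"
```

A credit decisioning model (decision impact plus a regulated function) tiers High; an internal document summarizer with no decision role tiers Low. The tier then drives validation depth and monitoring frequency downstream.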
What it maps to: SR 11-7 model development and implementation pillar; OCC model risk management; NIST AI RMF MAP and MEASURE functions (pre-deployment evaluation).
Domain 4: Ongoing Monitoring
Deploying a model is not the end of the compliance work. Ongoing monitoring is the part most programs build last — and the part that produces the most post-deployment exam findings.
What examiners look for:
- Defined monitoring metrics and thresholds for each production AI system
- Monitoring frequency appropriate to risk tier (High: monthly or real-time; Medium: quarterly; Low: annual review)
- A process for detecting model drift — degradation in output quality or accuracy over time
- Defined revalidation triggers (base model updates, use case changes, output quality breaches)
- Consumer complaints analyzed for patterns that might indicate AI-related harm
- Escalation procedures when monitoring metrics exceed thresholds
The set-it-and-forget-it problem: Most AI governance failures that become exam findings aren’t at deployment — they’re in the months after deployment, when no one is watching. A credit scoring model that drifts toward disparate outcomes. A customer service LLM that starts giving inaccurate product information. A fraud detection model whose false positive rate doubles after a vendor update. All of these are monitoring failures.
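All three failure modes above are threshold breaches that a basic monitoring loop would catch. A minimal sketch follows, assuming each model's monitoring plan defines acceptable bands per metric; the metric names and limits here are illustrative.

```python
# Minimal threshold-based monitoring check. Metric names and bands
# are illustrative; a real program defines them per model in the
# monitoring plan, at a frequency matching the model's risk tier.
def check_metrics(observed: dict[str, float],
                  thresholds: dict[str, tuple[float, float]]) -> list[str]:
    """Return escalation messages for any metric outside its
    (lower, upper) band, or missing entirely (a monitoring gap)."""
    breaches = []
    for metric, (lo, hi) in thresholds.items():
        value = observed.get(metric)
        if value is None:
            breaches.append(f"{metric}: no observation -- monitoring gap")
        elif not (lo <= value <= hi):
            breaches.append(f"{metric}={value} outside [{lo}, {hi}] -- escalate per policy")
    return breaches
```

The fraud example above (false positive rate doubling after a vendor update) is exactly the kind of breach this catches, provided someone defined the band and someone reads the output. The escalation path it feeds must already exist in policy.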
What it maps to: SR 11-7 ongoing monitoring requirements; OCC’s expectation for model performance tracking; NIST AI RMF MANAGE function.
Domain 5: Third-Party AI Vendor Oversight
OCC Bulletin 2013-29 (third-party risk management) applies to AI vendors. Your TPRM program needs AI-specific due diligence.
What examiners look for:
- AI vendors are included in your vendor risk tiering methodology
- Pre-onboarding due diligence for AI vendors includes AI-specific questions (training data, bias controls, explainability, drift monitoring, incident notification)
- Contracts with AI vendors include data handling protections, audit rights, and incident notification requirements
- AI vendor performance is monitored post-deployment
- Concentration risk — overreliance on a single AI vendor for critical functions — is assessed and documented
- Vendor model updates are reviewed before deployment (you know when your vendor’s AI changes)
The concentration risk problem: Many institutions rely on one or two foundational AI providers — often the same large tech companies — for a broad range of AI functions. The GAO’s 2025 report flagged this as an emerging risk that regulators are starting to ask about. If your fraud system, your credit scoring, your customer service AI, and your compliance monitoring all run on the same provider’s infrastructure, you have systemic concentration risk that belongs in your risk appetite discussion.
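A first-pass concentration check is just grouping your critical AI functions by underlying provider and flagging any provider that backs more than one. A sketch, with the function and vendor names hypothetical:

```python
from collections import defaultdict

# Illustrative concentration check: which providers back multiple
# critical AI functions? Function and provider names are hypothetical.
def concentration_report(systems: list[tuple[str, str]]) -> dict[str, list[str]]:
    """systems: (critical_function, provider) pairs. Returns providers
    supporting two or more critical functions -- candidates for the
    risk appetite discussion this article describes."""
    by_provider = defaultdict(list)
    for function, provider in systems:
        by_provider[provider].append(function)
    return {p: fns for p, fns in by_provider.items() if len(fns) >= 2}
```

Note that the grouping should reach through to the foundational model provider, not just the contracting vendor: two different SaaS vendors built on the same underlying model still concentrate your exposure.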
What it maps to: OCC Bulletin 2013-29; FFIEC IT Handbook third-party risk guidance; GAO-25-107197 findings on AI vendor oversight gaps.
Domain 6: Consumer Protection and Fair Lending
The AI-specific exam question most likely to produce real enforcement risk comes in two parts: does your AI create disparate impact, and are you monitoring for it?
What examiners look for:
- AI systems used in credit decisioning, pricing, or underwriting have been tested for disparate impact before deployment
- Disparate impact testing is periodic — not one-time — for models used in lending decisions
- Adverse action notices comply with FCRA/ECOA requirements for AI-assisted decisions (consumers have the right to know what factors influenced a decision)
- Explainability documentation exists for AI-driven credit decisions
- Marketing AI — models that target or segment customers — has been reviewed for fair lending implications
- AI-related consumer complaints are tracked and analyzed separately from general complaints
The adverse action notice problem: An AI system that contributes to a credit denial must still generate a compliant adverse action notice under ECOA and FCRA. Most traditional adverse action frameworks list specific factors (“too many inquiries,” “insufficient credit history”). When an LLM or a black-box ML model contributes to a decision, generating those specific factors becomes technically difficult — and regulators know it. If your AI-assisted credit decisions can’t produce ECOA-compliant adverse action notices, you have a regulatory exposure that will surface in an exam.
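One common first-pass screen for the disparate impact testing listed above is the adverse impact ratio, conventionally flagged when it falls below 0.8 (the “four-fifths rule”). It is a screening heuristic for prioritizing further fair lending analysis, not a legal threshold, and the numbers below are illustrative:

```python
# Adverse impact ratio ("four-fifths rule"): compare the approval
# rate of a protected group to that of the control group. A ratio
# below 0.8 is a conventional flag for deeper analysis -- it is a
# screening heuristic, not proof of a fair lending violation.
def adverse_impact_ratio(approved_protected: int, total_protected: int,
                         approved_control: int, total_control: int) -> float:
    rate_protected = approved_protected / total_protected
    rate_control = approved_control / total_control
    return rate_protected / rate_control

# Illustrative numbers: 60% approval for the protected group
# vs. 90% for the control group.
ratio = adverse_impact_ratio(60, 100, 90, 100)
needs_review = ratio < 0.8
```

The point for exam readiness is not the arithmetic but the cadence: this test must run periodically on production decisions, with results documented, because a clean pre-deployment test says nothing about a model that has drifted since.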
What it maps to: ECOA/Regulation B adverse action requirements; CFPB UDAAP authority; OCC fair lending examination procedures; GAO finding that AI can amplify bias risks in lending.
The Examiner’s Quick-Check List
When an examiner walks into an AI governance review, they’re likely to ask for five things immediately. If you can produce these, you’ve answered 80% of what follows:
| What They’ll Ask For | What “Good” Looks Like |
|---|---|
| Model inventory | Complete list of all production AI, tiered, with owners and validation status |
| Most recent model validation | Documented testing results, not just “passed review” |
| New model approval example | A recent deployment with a paper trail from request through approval |
| Third-party AI vendor list | AI vendors identified and their AI-specific due diligence documented |
| AI-related complaint data | Complaints pulled, categorized for AI-related patterns, with trend analysis |
Building Your Program: The 90-Day Exam-Ready Baseline
If you’re starting from a weak position, here’s a realistic 90-day sequence:
Days 1–30: Inventory and Triage
- Complete AI model inventory including all vendor-supplied AI
- Risk-tier every model (use High/Medium/Low based on decision impact, customer exposure, regulatory applicability)
- Identify the top 3–5 High-tier models — these get attention first
Days 31–60: Documentation and Governance
- Draft or update AI governance policy naming owners and approval authorities
- Build or update pre-deployment checklist
- Review third-party AI vendors against TPRM standards; send AI-specific questionnaires to High-tier vendors
- Define monitoring metrics for High-tier models
Days 61–90: Controls and Evidence
- Validate or review documentation for top High-tier models
- Run disparate impact testing for any AI involved in credit or pricing
- Pull 90-day complaint data; analyze for AI-related patterns
- Present AI program summary to management or board
This is a baseline, not a complete program. But it produces the artifacts examiners ask for first — and it demonstrates program credibility that shapes everything else in the review.
For teams building this with templates, the AI Risk Assessment Template includes an AI model inventory, pre-deployment checklist, third-party AI questionnaire, and monitoring documentation framework aligned to SR 11-7 and OCC guidance.
Related Template
AI Risk Assessment Template & Guide
Comprehensive AI model governance and risk assessment templates for financial services teams.
Frequently Asked Questions
What do regulators look for when examining an AI governance program?
What did the GAO find about how regulators examine AI in financial services?
Is there a single regulatory standard for AI governance in US banking?
What are the most common AI governance exam findings?
Does the OCC's October 2025 MRM bulletin change what I need to have in place?
What is the minimum viable AI governance program for a small fintech or community bank?
Rebecca Leung
Rebecca Leung has 8+ years of risk and compliance experience across first and second line roles at commercial banks, asset managers, and fintechs. Former management consultant advising financial institutions on risk strategy. Founder of RiskTemplates.