Common Regulatory Exam Findings on AI: Top Deficiencies and How to Fix Them
TL;DR
- AI governance exam findings are clustering in eight specific areas — model inventory, scope classification, validation, documentation, vendor oversight, explainability, monitoring, and incident response
- OCC Bulletin 2025-26 (October 2025) confirmed the broader MRM guidance review is underway, but current expectations still apply — and examiners are already looking
- The May 2025 GAO report on AI in financial services found SR 11-7 applied inconsistently, but institutions are still receiving findings where programs are absent or deficient
- Most of these deficiencies have workable fixes — the challenge is doing them before your next exam, not after
You know a problem is real when regulators start writing about it before most institutions have fully solved it. That’s exactly where AI governance is right now.
The OCC’s Semiannual Risk Perspective for Spring 2025 flagged AI — particularly models built on external data or third-party platforms — as an elevated risk area requiring examiner attention. The GAO’s May 2025 report on AI in financial services documented the existing supervisory framework and found that while SR 11-7 applies to AI, institutions frequently misapply it or apply it too narrowly. OCC Bulletin 2025-26, issued in October 2025, acknowledged ongoing confusion about MRM scope for community banks — and signaled that broader MRM guidance is under review.
None of that means examiners are giving institutions a pass while they wait for updated rules. It means the gap between where institutions are and where examiners expect them to be is already visible — and generating findings.
Here are the deficiencies showing up most frequently, and what fixing them actually looks like.
Why These Eight Areas, and Why Now
Regulatory examination for AI governance doesn’t follow a special checklist. Examiners apply existing frameworks — SR 11-7, OCC Bulletin 2011-12, FFIEC IT examination procedures — to AI systems and ask: does this institution have adequate controls over these models?
The eight deficiency areas below reflect where the existing framework meets AI-specific implementation gaps. Some of them are pure execution problems — institutions that never built the program. Others reflect genuine ambiguity where AI models have characteristics that don’t fit cleanly into legacy MRM structures. Both kinds produce findings.
Finding #1: Incomplete or Missing AI Model Inventory
What examiners see: An institution says it has “a handful of AI tools” in a model inventory. The examiner identifies fifteen additional AI systems — vendor-embedded ML in the AML platform, an LLM in the CRM, a fraud scoring model in the payment processor — that weren’t included.
Why it happens: Institutions built their original model inventories around internally developed, quantitative models — credit scoring, stress testing, CECL. AI is now embedded in almost every vendor-supplied software product, and most institutions haven’t updated their inventory scope to include it.
How to fix it: Run a cross-functional inventory sweep — IT, operations, compliance, and each business unit. Ask: what software uses automated decision logic, machine learning, or AI to produce outputs that affect customers, financial risk, or regulatory compliance? Include vendor-embedded AI. Classify each by risk tier (high, medium, low) based on use case and consequence of failure. Assign an owner. Document deployment date and last validation date. Review the inventory annually, and add new systems before go-live.
The standard for “complete” is not zero gaps — it’s a documented, repeatable process that would catch new AI systems before they operate outside the governance framework.
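The inventory fields above can live in a spreadsheet, a GRC tool, or anything in between — what matters is that each record carries an owner, a risk tier, and a validation date you can query. A minimal sketch in Python, with illustrative field names (the schema is an assumption, not a regulatory requirement):

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class ModelInventoryRecord:
    """One row in the AI model inventory (illustrative fields only)."""
    name: str
    owner: str                       # accountable individual, not a team alias
    risk_tier: str                   # "high" | "medium" | "low"
    vendor_embedded: bool            # True if the AI ships inside a vendor product
    deployed: date
    last_validated: Optional[date]   # None = never independently validated

def needs_attention(rec: ModelInventoryRecord, max_age_days: int = 365) -> bool:
    """Flag records that were never validated or whose validation is stale."""
    if rec.last_validated is None:
        return True
    return (date.today() - rec.last_validated) > timedelta(days=max_age_days)

# A vendor-embedded fraud engine that never went through validation —
# exactly the kind of record an inventory sweep should surface.
fraud_model = ModelInventoryRecord(
    name="vendor fraud scoring engine",
    owner="payments-risk lead",
    risk_tier="high",
    vendor_embedded=True,
    deployed=date(2023, 6, 1),
    last_validated=None,
)
print(needs_attention(fraud_model))  # → True
```

Running a query like `needs_attention` across the full inventory before each annual review is one way to demonstrate the “documented, repeatable process” examiners are looking for.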
Finding #2: AI Tools Not Classified Under SR 11-7 Scope
What examiners see: An institution uses an ML-based fraud scoring engine to block transactions in real time. It’s not in the model inventory. It’s never been validated. When asked, the team says “it’s a vendor product, not a model.”
Why it happens: SR 11-7 defines a model as a quantitative method, system, or approach that processes input data into quantitative estimates used in business decisions. The “vendor product” exception doesn’t exist — the vendor built it, but the institution is using it to make decisions. OCC Bulletin 2025-26 was specifically issued to address scope ambiguity for community banks, and it doesn’t create a vendor exception.
How to fix it: Apply the SR 11-7 model definition to your vendor-supplied AI tools. If the system ingests data, runs quantitative logic, and produces an output used in a consequential decision — credit, fraud, AML, customer service routing — it’s a model. Classify it accordingly, document the basis for classification, and apply appropriate governance. Some tools (basic automation, rule-based logic with no statistical learning) may fall outside model scope — document that determination too.
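The three-part test above — ingests data, runs quantitative logic, drives a consequential decision — can be captured as a simple classification helper so every determination gets recorded the same way. A sketch (the function and return strings are assumptions, not SR 11-7 language):

```python
def sr_11_7_in_scope(ingests_data: bool,
                     quantitative_logic: bool,
                     consequential_decision: bool) -> str:
    """Apply the three-part scope test described above.

    Either answer should be documented — the determination itself
    is part of the exam record.
    """
    if ingests_data and quantitative_logic and consequential_decision:
        return "model — apply MRM governance"
    return "out of scope — document the determination"

# Vendor fraud engine: ingests transactions, runs statistical scoring,
# blocks payments in real time. Vendor-built or not, it's a model.
print(sr_11_7_in_scope(True, True, True))

# Rule-based workflow router with no statistical learning: likely out
# of scope, but write down why.
print(sr_11_7_in_scope(True, False, True))
```

The point is not the code — it's that the same documented test gets applied to every tool, so “it's a vendor product, not a model” never appears in an exam transcript again.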
Finding #3: No Independent Validation
What examiners see: An institution’s AI model was developed by the analytics team, and the “validation” was reviewed by the same analytics team. Or: the vendor provided its own validation documentation, which the institution accepted as-is.
Why it happens: Independent validation is expensive and logistically difficult for small institutions. And vendor validation documentation feels like it should be sufficient — after all, they built it.
How to fix it: SR 11-7 is explicit: validation must be performed by parties independent of model development. For vendor AI, vendor documentation supports validation but doesn’t substitute for it. Independent validation doesn’t require a separate department — it requires someone without a conflict of interest, structured to evaluate whether the model does what it claims, performs as expected on your data, and is used as intended. Options include: internal audit with quantitative capability, a second-line model risk function, or a third-party validation firm for high-risk models.
For low-risk models, a risk-based, lighter-touch validation is defensible — but document the risk-based rationale. Don’t skip validation entirely; scale its depth to the risk tier.
Finding #4: Model Documentation Doesn’t Match Actual Deployment
What examiners see: The model documentation describes the model as originally designed. The production model has been retrained twice, had features removed, and is running on different infrastructure. Nobody updated the documentation.
Why it happens: Documentation is written once, filed away, and never revisited. Model changes that would trigger re-documentation in a mature MRM program slip through without triggering anything in less mature programs.
How to fix it: Establish a model change management process with explicit documentation triggers: model retraining, feature additions or removals, threshold changes, infrastructure migrations, and use case expansions. Any such change requires a documentation update; material changes also require re-validation. For LLMs and GenAI specifically — where the model may update through provider releases — document the version in use, the update cadence, and how you evaluate changes for risk impact.
Finding #5: Third-Party AI Vendors Not in TPRM Program
What examiners see: An institution applies rigorous vendor due diligence to its core banking processor and payment rails — SOC 2 reports reviewed, contractual SLAs negotiated, annual reviews scheduled. The AI vendor providing a credit scoring model has no questionnaire, no contractual monitoring rights, and hasn’t been reviewed since onboarding.
Why it happens: AI vendors often entered through a side door — a product team purchased an AI tool without routing through vendor management. Or the vendor is a software company, and the institution didn’t recognize that using their AI product creates a model risk exposure requiring TPRM treatment.
How to fix it: AI vendors whose tools meet the SR 11-7 model definition should be in your TPRM program with AI-specific due diligence. That means: a questionnaire covering training data sourcing and bias controls, model explainability and documentation, drift monitoring capabilities, incident notification procedures, and data handling under applicable privacy laws. The contractual relationship should give you access to validation documentation and the right to audit. For the OCC’s purposes, using a vendor’s AI model in consequential decisions doesn’t transfer the governance obligation to the vendor — it stays with you.
Finding #6: Explainability Requirements Not Documented
What examiners see: A credit decisioning AI is producing adverse action notices that say “decision based on model score.” The institution can’t explain which input features drove the score for any given applicant, and the model documentation provides no mechanism for doing so.
Why it happens: Many ML models — gradient boosting, neural networks, ensemble models — produce accurate predictions through processes that aren’t inherently interpretable. The pressure to deploy fast frequently outpaces the effort to build explainability into the governance framework.
How to fix it: For any AI model used in consumer-facing credit, lending, or account management decisions, document your explainability approach before deployment. For complex models, this means: identifying the top contributing features for any given decision, establishing a method for generating adverse action reason codes that satisfy ECOA and Reg B requirements, and testing whether those codes are actually meaningful to the applicant. SHAP values, LIME, and other interpretability techniques provide feature-level attribution — the documentation should explain which technique you use and how reason codes are derived. Examiners are not asking for mathematical derivations; they’re asking whether you have a documented, defensible method.
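To make the reason-code step concrete: whatever attribution technique you use (SHAP, LIME, or native coefficients), the output examiners want to see is a mapping from the largest negative contributions to plain-language adverse action reasons. A deliberately simplified sketch using hypothetical linear weights — real deployments would derive the contributions from the actual model and an interpretability library, and the weights, features, and reason codes below are invented for illustration:

```python
# Hypothetical standardized feature weights and reason-code mapping.
WEIGHTS = {"utilization": -1.8, "delinquencies": -2.4,
           "income": 0.9, "history_months": 0.6}
REASON_CODES = {"utilization": "Credit utilization too high",
                "delinquencies": "Recent delinquencies on file",
                "income": "Income insufficient for amount requested",
                "history_months": "Length of credit history too short"}

def adverse_action_reasons(applicant: dict, top_k: int = 2) -> list:
    """Return the top-k negative score contributions as plain-language reasons."""
    contributions = {f: WEIGHTS[f] * applicant[f] for f in WEIGHTS}
    # Most negative contributions first — these drove the denial.
    negatives = sorted((c, f) for f, c in contributions.items() if c < 0)
    return [REASON_CODES[f] for _, f in negatives[:top_k]]

applicant = {"utilization": 0.9, "delinquencies": 2.0,
             "income": 0.4, "history_months": 1.2}
print(adverse_action_reasons(applicant))
# → ['Recent delinquencies on file', 'Credit utilization too high']
```

The documentation artifact examiners ask for is exactly this chain: attribution method, contribution ranking, and the table that translates features into reasons an applicant can act on.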
Finding #7: No Ongoing Monitoring Program
What examiners see: An AI model was validated at deployment two years ago. It is running in production. Nothing has been done since. There are no monitoring dashboards, no defined performance thresholds, no re-validation trigger criteria.
Why it happens: Validation is a defined event with a clear output — a report. Ongoing monitoring is an open-ended commitment without a natural endpoint, and it typically gets deprioritized once the model is live.
How to fix it: Establish a monitoring framework before deployment, not after. The framework should include: the performance metrics being tracked (accuracy, approval rate, denial rate, output distribution), the frequency of review (monthly for high-risk, quarterly for medium-risk), and the thresholds that trigger escalation or re-validation. For AI models, add drift monitoring — Population Stability Index (PSI) or similar statistics that detect when the model’s input data distribution has shifted from its training data. The monitoring documentation should be reviewed at least annually and updated when thresholds are adjusted.
For more on what AI monitoring frameworks should include, see the SR 11-7 for AI Systems post.
Finding #8: AI Not Included in Incident Response Planning
What examiners see: An institution has a mature cybersecurity incident response plan and a well-tested business continuity program. Neither document addresses AI model failures: what constitutes an AI “incident,” who owns the response, what triggers escalation, and how affected parties are notified.
Why it happens: Incident response plans were written before AI was operational at scale. AI failures — a model producing systematically biased outputs, an LLM generating incorrect compliance guidance, a fraud model triggering a wave of false positives — don’t fit cleanly into existing incident response runbooks.
How to fix it: Add an AI-specific incident response annex. Define what constitutes an AI incident (performance degradation below threshold, evidence of systematic bias, data quality failure affecting model inputs, unauthorized model modification, vendor AI service outage). Assign clear ownership — typically the model owner with escalation to the model risk function. Establish escalation criteria that link AI incident severity to your enterprise incident response tiers. For LLMs used in customer-facing applications, define the process for identifying and notifying customers potentially affected by model errors.
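The incident taxonomy and escalation mapping in the annex can be as simple as a lookup table, as long as unknown incident types escalate by default rather than falling through. A sketch — the tier labels, owners, and response windows are hypothetical placeholders to be replaced with your enterprise incident response tiers:

```python
# Hypothetical AI incident taxonomy mapped to enterprise escalation tiers.
AI_INCIDENT_SEVERITY = {
    "performance_below_threshold": "tier-3 (model owner, 5 business days)",
    "systematic_bias_detected":    "tier-1 (MRM + compliance, same day)",
    "input_data_quality_failure":  "tier-2 (model owner + data team, 48h)",
    "unauthorized_model_change":   "tier-1 (MRM + security, same day)",
    "vendor_ai_outage":            "tier-2 (vendor manager, per SLA)",
}

def escalation_for(incident_type: str) -> str:
    """Route an AI incident to its escalation tier.

    Unclassified incident types escalate to the top tier by default —
    a gap in the taxonomy should never mean a slower response.
    """
    return AI_INCIDENT_SEVERITY.get(
        incident_type, "tier-1 (unclassified — escalate by default)")

print(escalation_for("systematic_bias_detected"))
```

The escalate-by-default branch is the design choice worth copying: it means the annex fails safe when a novel failure mode appears, which is precisely when improvisation is most dangerous.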
A 2025 survey found only 54% of organizations maintain incident response playbooks for AI-specific risks — which means nearly half of institutions would improvise when something goes wrong.
Prioritizing Your Fix List
If you’re looking at this list and trying to figure out where to start, prioritize by regulatory risk exposure:
| Priority | Deficiency | Risk Driver |
|---|---|---|
| 1 | AI model inventory | Foundational — everything else flows from knowing what you have |
| 2 | SR 11-7 scope classification | High-risk AI running outside MRM is the core finding |
| 3 | Vendor AI in TPRM | Third-party AI specifically flagged in OCC Spring 2025 report |
| 4 | Independent validation | SR 11-7 core requirement; gaps here produce MRAs, not recommendations |
| 5 | Explainability documentation | Consumer protection and fair lending risk escalation |
| 6 | Ongoing monitoring | Required under SR 11-7; absence is a clean finding |
| 7 | Documentation accuracy | Exam credibility; inconsistency signals weak governance |
| 8 | Incident response coverage | Growing scrutiny; lower acute risk but significant gap signal |
The AI Governance Program Checklist covers the full program requirements behind each of these areas.
So What?
Exam preparation for AI governance isn’t about finding a defensible answer when a finding lands — it’s about building the program before examiners walk through the door. The eight deficiencies above are patterns, not outliers. They represent what happens when institutions deploy AI faster than they build governance.
The OCC and Federal Reserve aren’t expecting perfection. They’re expecting a risk-based program with documented methodology, clear ownership, and demonstrable evidence of execution. An incomplete model inventory with a documented remediation plan is better than a claim of completeness that falls apart under questioning.
The institutions that are cleanest in AI governance examinations share one thing: they treat it as a program, not a project. There’s no finish line — just a continuous loop of inventory, validation, monitoring, and improvement.
If your AI governance program was built in a hurry, now is the time to rebuild it properly. The AI Risk Assessment Template provides the structure to get there: model inventory, pre-deployment checklist, vendor questionnaire, and governance documentation — designed for teams that need to show progress without building a full model risk function from scratch.
Related Template
AI Risk Assessment Template & Guide
Comprehensive AI model governance and risk assessment templates for financial services teams.
Rebecca Leung
Rebecca Leung has 8+ years of risk and compliance experience across first and second line roles at commercial banks, asset managers, and fintechs. Former management consultant advising financial institutions on risk strategy. Founder of RiskTemplates.
Keep Reading
AI and Fair Lending: UDAAP Risk in Algorithmic Decisioning
CFPB's UDAAP-as-discrimination gambit was vacated, but adverse action notice requirements still bite. Here's what AI lenders actually owe consumers in 2026.
Apr 13, 2026
Third-Party AI Vendor Risk Assessment: Due Diligence Framework and Questionnaire
When a vendor deploys AI in the service they provide you, your institution's model risk responsibility doesn't disappear. Here's the due diligence framework, questionnaire areas, and contract provisions you need before deploying a vendor's AI.
Apr 13, 2026
SR 11-7 for AI Systems: Applying Legacy Model Risk Guidance to LLMs
How to actually implement SR 11-7 for LLMs: model inventory, governance ownership, documentation standards, and validation scope for in-house and vendor AI.
Apr 12, 2026