AI Risk

Agentic AI Governance: The Compliance Gap Nobody's Talking About

April 20, 2026 · Rebecca Leung

TL;DR

  • Agentic AI—systems that plan, decide, and take autonomous actions—creates compliance gaps that SR 11-7, Reg E, and UDAAP weren’t designed to cover
  • The CFPB issued an August 2025 ANPRM specifically asking: who counts as a “representative” acting on a consumer’s behalf in an agentic context?
  • SR 11-7’s definition of a “model” is too narrow for systems that don’t produce estimates—they execute actions
  • Compliance programs need four components today: pre-authorization constraints, continuous validation protocols, decision trace logs, and UDAAP review for agent-generated communications

Your AI is making decisions right now. Calling APIs, generating customer emails, adjusting pricing parameters, flagging accounts for review. Maybe approving things. Maybe communicating things to customers that your compliance team hasn’t seen.

And if you’re being honest—your compliance program wasn’t built for this.

The gap isn’t about awareness. Risk managers know agentic AI is coming; many have been tracking it for months. The gap is structural. SR 11-7, Regulation E, UDAAP, and your current model governance framework all share a foundational assumption that turns out to be wrong for agentic systems: that a human stands between the AI’s output and the real-world consequence.

With traditional AI, a model scores a loan. A human approves it. With agentic AI, the model scores, decides, and acts—often before any human can intervene.

That difference changes everything about how compliance needs to work.

What “Agentic AI” Actually Means for Compliance

Before getting into the regulatory gaps, it’s worth being precise. “Agentic AI” isn’t just AI with more capabilities. It’s a fundamentally different operational model.

A conventional AI system produces an output. A human acts on that output. A model risk manager validates the model’s logic. An examiner reviews the output population. The feedback loop involves humans at every consequential step.

An agentic AI system is orchestrated differently:

| Characteristic | Traditional Model | Generative AI (LLM) | Agentic AI |
|---|---|---|---|
| Output type | Prediction or score | Text or content | Decision + action |
| Human role | Acts on output | Reviews output | Often bypassed entirely |
| Tool access | None | Limited | APIs, databases, email, payment rails |
| State | Stateless | Session-scoped | Persistent memory across sessions |
| Failure mode | Wrong prediction | Hallucination | Unauthorized or harmful action at scale |
| Validation window | Scheduled review | Scheduled review | Continuous; behavior changes between reviews |

Gartner projected that 40% of financial services firms would be deploying AI agents by end of 2026. The Consumer Bankers Association’s January 2026 agentic AI white paper documented over a dozen use cases already in production across retail banking, payments, and commercial lending—including agents that send customer notifications, adjust exposure limits, and initiate collections workflows.

None of those use cases fit neatly into your current governance model.

The SR 11-7 Problem: When Validation Assumptions Break

SR 11-7—the Federal Reserve’s 2011 guidance on model risk management, jointly issued with the OCC as OCC Bulletin 2011-12—defines a model as “a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.”

That definition assumes:

  • The model produces an estimate (not an action)
  • The system is relatively static (stable parameters between reviews)
  • Decision paths are reconstructible (validation can determine how an output was reached)

Agentic AI breaks all three assumptions.

As the GARP analysis on SR 11-7 and agentic AI laid out clearly, “material changes in behavior can occur without a formal redevelopment event”—meaning an agentic system might behave differently after weeks of interaction even though no one changed the model. Traditional periodic validation cycles miss this entirely.

The three most consequential gaps:

Gap 1: Dynamic Validation

SR 11-7’s validation framework assumes you can validate a model and then apply that validation for a review cycle. Conceptual soundness assessments, outcomes analysis, and benchmarking are all designed for systems that stay reasonably stable between reviews.

Agentic systems don’t. They adapt, learn from interactions, and may develop behavioral drift that has nothing to do with a formal model change. An agent that handles collections calls might shift toward more aggressive language patterns after months of reinforcement—without any model update triggering a revalidation event.

What this means in practice: validation for agentic systems needs to be continuous, not periodic. You need monitoring that detects behavioral drift in real time, not an annual validation cycle that catches it 11 months late.
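
One way to make that concrete, offered as a rough sketch rather than a prescribed method: track the agent's action mix against the baseline captured at validation, and alert when the distribution drifts. The action categories, thresholds, and choice of metric below are illustrative assumptions, not SR 11-7 requirements.

```python
# Minimal sketch of continuous behavioral-drift monitoring for an agentic system.
# The metric (population stability index over the agent's action-type mix) and the
# amber/red thresholds are illustrative assumptions, not a regulatory standard.
import math
from collections import Counter

def action_distribution(actions, categories):
    """Share of each action category in a window of logged agent actions."""
    counts = Counter(actions)
    total = max(len(actions), 1)
    # Small floor avoids log(0) for categories unseen in a window.
    return {c: max(counts.get(c, 0) / total, 1e-6) for c in categories}

def population_stability_index(baseline, current):
    """Standard PSI: sum((curr - base) * ln(curr / base)) across categories."""
    return sum((current[c] - baseline[c]) * math.log(current[c] / baseline[c])
               for c in baseline)

CATEGORIES = ["send_notice", "adjust_limit", "escalate_to_human", "initiate_payment"]

# Baseline captured at validation time; current window pulled from live action logs.
baseline_window = ["send_notice"] * 70 + ["adjust_limit"] * 20 + ["escalate_to_human"] * 10
current_window = ["send_notice"] * 50 + ["adjust_limit"] * 15 + ["initiate_payment"] * 25 + ["escalate_to_human"] * 10

psi = population_stability_index(
    action_distribution(baseline_window, CATEGORIES),
    action_distribution(current_window, CATEGORIES),
)

# Illustrative amber/red cutoffs borrowed from common credit-model monitoring practice.
if psi >= 0.25:
    print(f"RED: behavioral drift PSI={psi:.2f}; pause agent pending revalidation")
elif psi >= 0.10:
    print(f"AMBER: behavioral drift PSI={psi:.2f}; trigger targeted review")
else:
    print(f"OK: PSI={psi:.2f}")
```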

Gap 2: Third-Party Concentration Risk

Most agentic AI deployments don’t use models built in-house. They use foundation models from OpenAI, Anthropic, Google, or Microsoft as the underlying reasoning engine, then layer proprietary business logic on top.

This creates a concentration risk that SR 11-7’s third-party model validation requirements weren’t designed to handle. When the underlying model provider updates their model—which happens continuously for cloud-hosted systems—your agent’s behavior can change without your knowledge and without triggering any internal review process.

The interagency third-party risk management guidance (which superseded OCC Bulletin 2013-29 in 2023) applies, but it wasn’t built for a scenario where the “vendor” updates the product in real time. Your vendor risk assessment needs a new section.

Gap 3: Explainability Standards

SR 11-7 “emphasizes transparency sufficient to enable effective challenge”—but provides no specific standard for what explainability looks like for a system that chains multiple decisions in a multi-step reasoning process.

If an agent makes a credit exposure decision based on six sequential reasoning steps, none of which are individually recordable in your current audit log, how do you explain that to an examiner? What does “adequate explainability” mean for a system that reasons differently every time?

Regulators haven’t answered this yet. That doesn’t mean you can wait.
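
One interim measure is to make each step of the chain at least reconstructible. A minimal sketch, assuming a hypothetical exposure-management agent and a made-up trace schema, not an examiner-endorsed format:

```python
# Minimal sketch of a decision trace for a multi-step agent, so each reasoning
# step is individually recorded rather than lost inside one opaque completion.
# Field names and the example scenario are hypothetical, not an examiner standard.
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class ReasoningStep:
    step: int
    description: str          # what the agent was trying to establish at this step
    inputs_used: list         # data the step relied on (account fields, tool results)
    tool_called: str | None   # external call made at this step, if any
    intermediate_result: str  # what the step concluded

@dataclass
class DecisionTrace:
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    agent_id: str = "exposure-agent-v3"   # hypothetical agent identifier
    final_action: str = ""
    steps: list = field(default_factory=list)

    def record(self, step: ReasoningStep):
        self.steps.append(step)

    def to_json(self) -> str:
        """Serialize for the audit store; a reviewer gets the full chain, in order."""
        return json.dumps(asdict(self), indent=2)

trace = DecisionTrace(final_action="reduce_credit_exposure_limit")
trace.record(ReasoningStep(1, "Check utilization trend", ["utilization_90d"], None, "rising"))
trace.record(ReasoningStep(2, "Refresh bureau data", ["customer_id"], "bureau_api.refresh", "score_down_40"))
trace.record(ReasoningStep(3, "Apply exposure policy", ["policy_v12"], None, "limit_cut_15pct"))
print(trace.to_json())
```

The specific fields matter less than the principle: every step the agent chains becomes an individually recorded, individually challengeable artifact instead of something buried inside a single model call.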

Reg E and the Authorization Void

Here’s the compliance gap that could produce the largest consumer protection liability exposure.

Regulation E, which implements the Electronic Fund Transfer Act (EFTA), protects consumers in electronic fund transfers. The dispute framework assumes a consumer authorized a specific transaction. When something goes wrong, the consumer disputes it, and the financial institution investigates whether the transaction was authorized.

What happens when an AI agent initiates a transaction on the consumer’s behalf?

This is the exact question the CFPB is wrestling with. In August 2025, the bureau issued an Advance Notice of Proposed Rulemaking on personal financial data rights, specifically seeking comment on who can serve as a “representative” operating on a consumer’s behalf. The Center for Data Innovation’s March 2026 analysis pointed out the core problem: Reg E mentions authorization via “card, code, or other means” but “provides no framework for disputes when agents malfunction—such as ordering incorrect items or failing to recognize artificially inflated prices that humans would catch.”

The practical problem for compliance: if a consumer authorizes an AI agent to manage their finances, and the agent initiates a transfer the consumer later disputes—is that an authorized transaction? Under current Reg E, the answer is genuinely unclear.

This matters for any financial institution building:

  • Automated bill payment agents
  • Personal financial management tools that move money
  • AI-driven savings or investment rebalancing
  • Any consumer-facing agent that touches payment rails

The CFPB hasn’t finished the rulemaking. But enforcement won’t wait for rulemaking to finish. Build your authorization frameworks now.

What the authorization framework needs (a rough sketch of the supporting records follows the list):

  • Explicit consumer consent to agent authority, scoped to specific action types
  • Spending limits and action category limits baked into the agent’s permission model
  • Human override availability for any transaction above a defined threshold
  • Transaction logs that clearly show agent-initiated vs. consumer-initiated activity
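
A minimal sketch of what those records might look like, using hypothetical field names, limits, and an override threshold rather than anything Reg E currently prescribes:

```python
# Minimal sketch of Reg E-oriented authorization records: a consent scope the
# consumer explicitly granted, and a transaction entry that records whether the
# agent or the consumer initiated it and under which consent. Field names,
# limits, and the override threshold are illustrative assumptions only.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentConsentScope:
    consent_id: str
    consumer_id: str
    permitted_action_types: tuple = ("bill_pay", "transfer_internal")  # explicit, enumerated
    per_transaction_limit: float = 500.00        # agent cannot exceed this amount
    monthly_limit: float = 2_000.00
    human_override_above: float = 250.00         # above this, the consumer must confirm
    granted_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

@dataclass
class TransactionRecord:
    txn_id: str
    amount: float
    action_type: str
    initiated_by: str            # "agent" or "consumer" -- the field disputes will turn on
    consent_id: str | None       # which consent scope the agent acted under, if agent-initiated
    required_human_confirmation: bool

def within_scope(scope: AgentConsentScope, action_type: str, amount: float) -> bool:
    """True only if the action type was explicitly consented to and under the limit."""
    return action_type in scope.permitted_action_types and amount <= scope.per_transaction_limit

scope = AgentConsentScope(consent_id="c-001", consumer_id="u-123")
ok = within_scope(scope, "bill_pay", 180.00)
txn = TransactionRecord("t-789", 180.00, "bill_pay", initiated_by="agent",
                        consent_id=scope.consent_id,
                        required_human_confirmation=180.00 > scope.human_override_above)
print(ok, txn.initiated_by, txn.required_human_confirmation)
```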

UDAAP Risk: When Your Agent Speaks

The CFPB has been clear that UDAAP applies to AI-generated consumer communications. In the bureau’s words: “There are no exceptions to the federal consumer financial protection laws for new technologies.”

That statement should focus attention on a specific agentic AI risk that most compliance programs have completely missed: the agent is talking to your customers, and compliance hasn’t reviewed what it’s saying.

Traditional consumer communications go through a review workflow. A compliance officer reviews the draft, flags potentially deceptive language, approves the final version. That workflow assumes there’s a fixed document to review.

Agentic AI generates bespoke communications dynamically. Each message is different—different tone, different framing, different emphasis—based on the interaction context. You cannot review agentic communications the same way you review a batch-generated notice.

The UDAAP risk is real and specific:

  • An agent that handles fee complaints might describe fee structures in ways that are technically accurate but misleading in context
  • An agent that manages collections might use language calibrated for engagement that ends up being aggressive in ways that would fail UDAAP review
  • An agent that discusses account options might emphasize certain products in ways that could constitute deceptive marketing

The compliance question isn’t just “what did the agent say”—it’s “what does the agent say across 10,000 conversations, and does the aggregate pattern constitute a deceptive practice?”

What the UDAAP review process for agentic AI needs:

  • Systematic sampling of agent-generated communications (not just reviewing templates)
  • Behavioral monitoring for language drift and escalation patterns
  • Consumer complaint analysis for agent-specific issues
  • Pre-deployment testing scenarios for deception and manipulation vectors

What Your Compliance Program Is Actually Missing

Most financial institutions deploying agentic AI have some version of the following:

  • An AI use case inventory that includes their agents
  • A vendor due diligence questionnaire for the foundation model provider
  • Some version of “human-in-the-loop” for high-stakes decisions

That’s a start. It’s not a governance framework.

Here’s the specific list of what’s typically missing:

| Gap | What’s Missing | Why It Matters |
|---|---|---|
| Pre-authorization scope definition | No documented list of what the agent can and cannot do before deployment | Creates unlimited authority for agent errors |
| Continuous validation protocol | Periodic review cycle only | Misses behavioral drift and emergent behavior |
| Agent action audit trail | No structured log of what the agent did and why | Cannot satisfy examiner documentation requests |
| Reg E authorization mapping | Vague “consumer consent” without transaction-type scoping | Dispute liability exposure |
| UDAAP monitoring for agent communications | Template review only, no dynamic content monitoring | Pattern-level deception risk |
| Circuit breaker definitions | No defined conditions that halt agent operations | Cascade failure risk |
| Third-party model change management | No protocol for vendor model updates | Undetected behavioral drift from upstream changes |

The Treasury’s FS AI RMF (February 2026) introduced 230 control objectives across seven domains. It’s the most comprehensive sector-specific framework available. But it was built around the same model governance assumptions as SR 11-7—it doesn’t specifically address the unique compliance obligations of systems that take autonomous action.

Building the Agentic Governance Stack: What to Do in the Next 90 Days

You don’t have to wait for regulatory clarity to start building. The controls below apply regardless of what specific rules ultimately emerge, because they’re grounded in basic compliance hygiene—authorization, transparency, accountability, and monitoring.

Days 1–30: Scope and Constrain

Agent inventory. Add a new field to your AI use case inventory: does this system take direct actions (versus producing outputs for human review)? For every agentic system, document the specific actions it can take.

Pre-authorization frameworks. For each agent, define: action categories permitted, spending/transaction limits, prohibited actions, and escalation triggers. This is your agent’s “permission model”—it should be documented before the agent goes live and reviewed whenever the agent’s capabilities expand.
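
As a rough illustration of the shape this can take, here is a permission model enforced as a pre-execution check. The action categories, limits, and escalation triggers are hypothetical; the point is that they are documented and evaluated before any action executes.

```python
# Minimal sketch of a pre-authorization "permission model" enforced at runtime.
# The categories, limits, and escalation triggers below are hypothetical; the
# control is that they exist in writing and are checked before execution.
PERMISSION_MODEL = {
    "agent_id": "collections-agent-v2",           # hypothetical agent
    "permitted": {"send_reminder", "offer_payment_plan", "flag_for_review"},
    "prohibited": {"waive_fee", "close_account", "report_to_bureau"},
    "escalate_if": {
        "payment_plan_amount_over": 1_000.00,     # dollar threshold requiring human sign-off
        "customer_flagged_vulnerable": True,      # hardship/vulnerability flag
    },
}

def authorize(action: str, context: dict, model: dict = PERMISSION_MODEL) -> str:
    """Return 'deny', 'escalate', or 'allow' for a proposed agent action."""
    if action in model["prohibited"] or action not in model["permitted"]:
        return "deny"                             # anything not explicitly permitted is denied
    if context.get("vulnerable_customer") and model["escalate_if"]["customer_flagged_vulnerable"]:
        return "escalate"
    if context.get("amount", 0) > model["escalate_if"]["payment_plan_amount_over"]:
        return "escalate"
    return "allow"

print(authorize("offer_payment_plan", {"amount": 250.00}))    # allow
print(authorize("offer_payment_plan", {"amount": 2_500.00}))  # escalate
print(authorize("waive_fee", {"amount": 35.00}))              # deny
```

Note the default-deny posture: anything not on the permitted list is blocked, which keeps the agent’s authority bounded by the document rather than by whatever the model decides to try.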

Circuit breakers. Define the conditions under which an agent stops operating: error rate thresholds, unusual transaction volume, consumer complaint spikes, security alerts. Document who can reset the circuit breaker and what review is required before reinstatement.
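
A minimal sketch of what those breaker conditions might look like in configuration, with illustrative thresholds rather than calibrated ones:

```python
# Minimal sketch of a circuit breaker that halts agent operations when defined
# conditions trip. Thresholds and metric names are illustrative assumptions;
# who may reset the breaker should be documented alongside this config.
BREAKER_CONDITIONS = {
    "error_rate_1h": 0.05,              # >5% failed actions in the last hour
    "txn_volume_vs_baseline": 3.0,      # 3x normal transaction volume
    "complaints_per_day": 10,           # agent-attributed complaint spike
    "security_alert": True,             # any open security alert halts the agent
}

def breaker_tripped(metrics: dict) -> list:
    """Return the list of tripped conditions; any non-empty result halts the agent."""
    tripped = []
    if metrics.get("error_rate_1h", 0) > BREAKER_CONDITIONS["error_rate_1h"]:
        tripped.append("error_rate_1h")
    if metrics.get("txn_volume_vs_baseline", 1.0) > BREAKER_CONDITIONS["txn_volume_vs_baseline"]:
        tripped.append("txn_volume_vs_baseline")
    if metrics.get("complaints_per_day", 0) > BREAKER_CONDITIONS["complaints_per_day"]:
        tripped.append("complaints_per_day")
    if metrics.get("security_alert", False):
        tripped.append("security_alert")
    return tripped

live = {"error_rate_1h": 0.08, "txn_volume_vs_baseline": 1.2, "complaints_per_day": 3}
tripped = breaker_tripped(live)
if tripped:
    print(f"HALT agent: {tripped}; reinstatement requires documented review")
```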

Days 31–60: Instrument and Monitor

Action audit logs. Work with your engineering team to ensure every agent action is logged with: timestamp, action type, input context, output action, and confidence/reasoning trace where available. This doesn’t need to be perfect—it needs to be sufficient to reconstruct what happened in a dispute or examination.
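
A minimal sketch of such a log, assuming a hypothetical append-only JSON-lines store and illustrative field names:

```python
# Minimal sketch of a structured action log: one JSON line per agent action with
# the fields named above. File path and field names are illustrative assumptions.
import json
from datetime import datetime, timezone

AUDIT_LOG_PATH = "agent_actions.jsonl"   # hypothetical append-only store

def log_agent_action(agent_id: str, action_type: str, input_context: dict,
                     output_action: dict, reasoning_trace_id: str | None = None):
    """Append one structured record per action; enough to reconstruct a dispute."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "action_type": action_type,
        "input_context": input_context,
        "output_action": output_action,
        "reasoning_trace_id": reasoning_trace_id,  # link to a fuller decision trace, if kept
    }
    with open(AUDIT_LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_agent_action(
    agent_id="billpay-agent-v1",
    action_type="initiate_payment",
    input_context={"consumer_id": "u-123", "biller": "utility-co", "due": "2026-05-01"},
    output_action={"amount": 84.10, "scheduled_for": "2026-04-29"},
    reasoning_trace_id="trace-4821",
)
```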

Behavioral monitoring. Implement monitoring for: agent communication tone and language drift, transaction volume anomalies, customer complaint patterns tied to agent interactions, and error rates by action type. Set amber/red thresholds and document who receives alerts.
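
As a rough sketch, the monitoring can be as simple as a documented threshold table and a routing rule for who gets notified. Metric names, thresholds, and recipients below are illustrative assumptions:

```python
# Minimal sketch of amber/red behavioral monitoring for an agent. Metric names,
# thresholds, and alert recipients are illustrative assumptions that show the
# shape of the control, not calibrated values.
THRESHOLDS = {
    # metric: (amber, red)
    "language_drift_score": (0.10, 0.25),             # drift vs. approved tone baseline
    "complaint_rate_per_1k_interactions": (2.0, 5.0),
    "error_rate_by_action": (0.02, 0.05),
}
ALERT_ROUTING = {
    "amber": ["model_risk@bank.example"],
    "red": ["model_risk@bank.example", "compliance@bank.example"],
}

def evaluate(metrics: dict) -> dict:
    """Map each observed metric to 'ok', 'amber', or 'red'."""
    status = {}
    for name, value in metrics.items():
        amber, red = THRESHOLDS[name]
        status[name] = "red" if value >= red else "amber" if value >= amber else "ok"
    return status

observed = {"language_drift_score": 0.18, "complaint_rate_per_1k_interactions": 1.1, "error_rate_by_action": 0.06}
for metric, level in evaluate(observed).items():
    if level != "ok":
        print(f"{level.upper()}: {metric}={observed[metric]} -> notify {ALERT_ROUTING[level]}")
```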

Consumer authorization mapping. Review your consumer authorization disclosures for any product that uses an agentic system. Are consumers explicitly authorizing the types of actions the agent can take? Is the scope of agent authority clearly disclosed? Fix gaps before deployment or as part of your next disclosure update cycle.

Days 61–90: Test and Validate

Adversarial testing. Before deploying any consumer-facing agent—and as part of periodic revalidation—test specifically for: unauthorized action scope expansion, deceptive communication patterns, response to edge cases and manipulation attempts, and behavior under error conditions.

UDAAP sampling. Pull a stratified sample of agent-generated communications. Route them through your existing UDAAP review process. Identify patterns rather than individual messages. Build this into your ongoing monitoring cadence.
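
A minimal sketch of the sampling step, assuming hypothetical conversation types and sampling rates, with higher-risk strata oversampled so collections and hardship conversations are not drowned out by servicing volume:

```python
# Minimal sketch of stratified sampling of agent-generated communications for
# UDAAP review. The strata, rates, and record fields are illustrative assumptions.
import random
from collections import defaultdict

SAMPLE_RATES = {"collections": 0.10, "fee_dispute": 0.05, "general_servicing": 0.01}

def stratified_sample(messages: list, rates: dict = SAMPLE_RATES, seed: int = 7) -> list:
    """Each message is a dict with at least 'conversation_type' and 'text'."""
    random.seed(seed)                              # reproducible pulls for the review file
    by_stratum = defaultdict(list)
    for m in messages:
        by_stratum[m["conversation_type"]].append(m)
    sample = []
    for stratum, msgs in by_stratum.items():
        k = max(1, round(len(msgs) * rates.get(stratum, 0.02)))  # floor of one per stratum
        sample.extend(random.sample(msgs, min(k, len(msgs))))
    return sample

messages = (
    [{"conversation_type": "collections", "text": f"collections msg {i}"} for i in range(200)]
    + [{"conversation_type": "fee_dispute", "text": f"fee msg {i}"} for i in range(400)]
    + [{"conversation_type": "general_servicing", "text": f"servicing msg {i}"} for i in range(5000)]
)
for m in stratified_sample(messages)[:5]:
    print(m["conversation_type"], "->", m["text"])
```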

Third-party change management. Establish a protocol for foundation model vendor updates: how are you notified? What testing is triggered before resuming production use? What behavioral benchmarks establish the baseline?
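
A rough sketch of that gate, assuming a hypothetical benchmark set and a stand-in run_agent() harness: replay fixed cases against the updated model and block production use if decisions deviate from the documented baseline beyond a tolerance you set.

```python
# Minimal sketch of a vendor-update gate: before resuming production use after a
# foundation-model update, replay a fixed benchmark set and compare the agent's
# decisions against the documented baseline. Benchmark cases, the tolerance, and
# run_agent() are hypothetical stand-ins for your own test harness.
BENCHMARK_CASES = [
    {"case_id": "b-01", "input": {"balance": 120.0, "days_past_due": 10}, "baseline_action": "send_reminder"},
    {"case_id": "b-02", "input": {"balance": 2400.0, "days_past_due": 65}, "baseline_action": "escalate_to_human"},
    {"case_id": "b-03", "input": {"balance": 40.0, "days_past_due": 0}, "baseline_action": "no_action"},
]
MAX_DEVIATION_RATE = 0.05   # illustrative: block release if >5% of cases change decisions

def run_agent(case_input: dict) -> str:
    """Placeholder for invoking the agent against the updated foundation model."""
    return "send_reminder" if case_input["days_past_due"] > 5 else "no_action"

def gate_vendor_update(cases: list) -> bool:
    deviations = [c["case_id"] for c in cases if run_agent(c["input"]) != c["baseline_action"]]
    rate = len(deviations) / len(cases)
    print(f"deviating cases: {deviations} ({rate:.0%})")
    return rate <= MAX_DEVIATION_RATE

if not gate_vendor_update(BENCHMARK_CASES):
    print("Hold production use: behavioral baseline broken by upstream model change")
```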

So What?

The compliance gap in agentic AI isn’t a theoretical future problem. Agents are in production at financial institutions today. The CFPB is actively working on regulatory frameworks that will fill the current voids. When those frameworks arrive, they will be enforced—and enforcement will look backward at the control environments institutions had in place.

The question isn’t whether agentic AI governance will become a regulatory priority. The GAO’s 2025 report on AI oversight in financial services (GAO-25-107197) made the regulatory trajectory clear: more oversight is coming, specifically for autonomous systems. The question is whether your compliance program has documented evidence that you took this seriously before the examiner arrived.

The practitioners who build these frameworks now—before the rules are finalized—are the ones who have documented evidence of thoughtful governance when examiners show up asking about the agent that sent 50,000 collection notices last quarter.

The practitioners who wait for the final rules are the ones who have “we’re working on it” as their answer.

Build the AI governance program checklist now, before your agentic systems are in scope for a regulatory exam. See also our guide on applying SR 11-7 to AI systems and the broader agentic AI risk management framework for governance controls.

If you’re building or expanding an AI governance program, the AI Risk Assessment Template includes a model inventory with agentic AI fields, pre-deployment checklist, and third-party vendor questionnaire designed for the current regulatory environment.

Frequently Asked Questions

What is the compliance gap with agentic AI?
Agentic AI systems—those that take autonomous actions, call APIs, send communications, and execute transactions without human review—fall outside the assumptions embedded in SR 11-7 (which defines models as producing quantitative estimates), Reg E (which has no authorization framework for AI-initiated transactions), and UDAAP (which has no specific standard for AI-generated consumer communications).
Does SR 11-7 apply to agentic AI systems?
SR 11-7 applies, but imperfectly. It defines a model as a system that processes input data into 'quantitative estimates', and its validation expectations assume relatively stable behavior between reviews, a framing too narrow for systems that plan, decide, and act autonomously. Validation approaches designed for stable models lose effectiveness when agents recalibrate continuously or take actions outside any scheduled review cycle.
What is the CFPB doing about agentic AI and Reg E?
In August 2025, the CFPB issued an Advance Notice of Proposed Rulemaking on personal financial data rights, specifically seeking comment on who qualifies as a 'representative' acting on a consumer's behalf under Regulation E. The rulemaking is ongoing, but compliance teams shouldn't wait—you need authorization frameworks in place before an agent-initiated transaction becomes a regulatory event.
What should compliance programs build for agentic AI today?
Four immediate priorities: (1) pre-authorization constraint frameworks that define what an agent can and cannot do before deployment, (2) continuous validation protocols rather than periodic review cycles, (3) decision trace logs that satisfy audit trail requirements, and (4) a UDAAP review process for all consumer-facing communications generated by AI agents.
How is agentic AI different from traditional AI models for compliance purposes?
Traditional AI models generate outputs (predictions, scores) for humans to act on. Agentic AI systems act directly—executing transactions, sending messages, calling APIs, and modifying records. This difference in autonomy fundamentally changes every compliance question: who authorized this? who's responsible if it's wrong? how do you validate a system that changes behavior between reviews?
Is there a regulatory framework specifically for agentic AI in financial services?
Not yet. No major regulatory framework—including the Treasury's FS AI RMF (February 2026), SR 11-7, or the NIST AI RMF—has specifically addressed agentic AI governance requirements. The CFPB's data rights ANPRM and the Consumer Bankers Association's January 2026 agentic AI white paper are the closest thing to formal regulatory engagement on this topic.
Rebecca Leung

Rebecca Leung has 8+ years of risk and compliance experience across first and second line roles at commercial banks, asset managers, and fintechs. Former management consultant advising financial institutions on risk strategy. Founder of RiskTemplates.

