KRI Thresholds: How to Stop Your Dashboard From Creating False Greens and False Reds
Your KRI dashboard is green. Every metric. Every domain. Has been for the past nine months. Your CRO just got asked by the audit committee whether the green dashboard means the firm is well-controlled, or whether it means the dashboard is broken.
If you can’t answer that question with data, the dashboard is broken.
TL;DR
- False greens hide risk because thresholds are set too loose. False reds train people to ignore the dashboard because thresholds are set against noise.
- The fix is statistical, not theatrical: 12-24 months of historical data, threshold set at 95th percentile or risk appetite — whichever is tighter — and a 60-day parallel run before go-live.
- Red threshold must be tighter than or equal to the risk appetite limit for the same risk. If it’s looser, your dashboard is structurally incapable of warning you.
- The deadliest KRI failure mode is “perpetual green” — a metric that has not triggered amber in 18+ months and that nobody checks. Recalibrate it, replace it, or retire it.
Why Most KRI Thresholds Are Wrong on Day One
Walk into any mid-size financial institution and pull up the KRI library. Most thresholds will be round numbers — 5%, 10%, 25 events. They were set by a working group three years ago based on what felt reasonable at the time. Nobody pulled historical data. Nobody computed a distribution. Nobody asked whether the threshold was tighter or looser than the risk appetite statement.
The result is a dashboard that’s wrong in two specific, opposite ways.
Failure mode 1: False greens
The threshold is set looser than it should be. Risk increases, the metric moves, but it stays inside green territory until the loss event actually happens. The dashboard stays calm right up until the moment it doesn’t matter anymore.
This is what happened at Silicon Valley Bank. Per the OIG’s Material Loss Review, SVB had liquidity metrics in place but the thresholds assumed deposit stickiness that didn’t hold in stress. The metrics weren’t lying — they were calibrated against a deposit base that didn’t exist anymore. By the time the early warning indicators actually triggered, the run was already happening.
False greens are dangerous because they generate confidence. The audit committee sees a green dashboard, concludes risk is well-managed, and moves on. The metric is decorative.
Failure mode 2: False reds
The threshold is set tighter than necessary, or against natural variance. The metric trips amber or red on noise — seasonality, normal business variation, one bad week — without any actual risk increase. The first time you investigate, you find nothing. The second time, you find nothing. By the third time, the operational risk team stops investigating amber signals on this metric.
That’s alert fatigue, and it’s not theoretical. According to industry research on KRI dashboard effectiveness, when more than 20% of amber/red signals don’t correspond to a real risk event, the dashboard loses credibility — and the metric is functionally retired even if it stays on the dashboard.
The Threshold Calibration Workflow
There’s a defensible method for setting thresholds. Use it.
Step 1: Pull 12-24 months of metric history
You cannot calibrate a threshold without a distribution. If the metric is new, run it in shadow mode for 90 days before setting any threshold. If the metric existed before but wasn’t tracked formally, pull the historical data from source systems even if it wasn’t in the dashboard.
Compute three things (a quick code sketch follows the list):
- Mean — what’s normal
- Standard deviation — how much normal varies
- 95th percentile — what an unusually high reading looks like under historical conditions
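A minimal sketch of that computation in Python, assuming the metric history is a plain list of monthly readings (values and variable names are illustrative):

```python
import statistics

# 18 months of illustrative monthly readings for a single KRI
# (e.g., a failed-trade rate, in percent)
history = [0.8, 0.9, 1.1, 0.7, 1.0, 0.9, 1.2, 0.8, 1.0,
           1.1, 0.9, 1.3, 0.8, 1.0, 1.2, 0.9, 1.1, 1.4]

mean = statistics.mean(history)                # what's normal
stdev = statistics.stdev(history)              # how much normal varies
p95 = statistics.quantiles(history, n=20)[-1]  # 95th percentile: an unusually high reading

print(f"mean={mean:.2f}, stdev={stdev:.2f}, p95={p95:.2f}")
```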
Step 2: Find the risk appetite anchor
Pull the risk appetite statement for the corresponding risk. Find the quantitative limit. The 95th percentile from Step 1 and the appetite limit are your two candidate red thresholds.
Rule: red threshold equals the tighter of the two. If your appetite says 2% and your 95th percentile is 1.4%, your red is 1.4%. If your 95th percentile is 3.5% and your appetite is 2%, your red is 2% — and you have a separate problem because your historical data suggests you’ve been breaching appetite without escalating.
Step 3: Set amber at 70-80% of red
The amber band gives Treasury, ops, or the relevant 1LoD team room to act before red breaches require board escalation. Too narrow (95% of red) and amber becomes a synonym for red. Too wide (50%) and amber loses urgency.
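A sketch covering Steps 2 and 3 together, assuming a metric where higher readings mean more risk, so the tighter of the two candidate thresholds is the lower one (all values illustrative):

```python
p95 = 1.4             # 95th percentile from Step 1 (%)
appetite_limit = 2.0  # quantitative limit from the risk appetite statement (%)

# Step 2: red is the tighter of the two candidates.
red = min(p95, appetite_limit)
if p95 > appetite_limit:
    # Historical readings have exceeded appetite: a separate escalation problem.
    print("Warning: historical data suggests appetite breaches without escalation")

# Step 3: amber at 70-80% of red; 75% is a reasonable midpoint.
amber = 0.75 * red

print(f"red={red:.2f}%, amber={amber:.2f}%")
```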
Step 4: Run a 60-day parallel period
Before the threshold goes live in board reporting, run it in parallel: track when amber and red would have triggered, and for each signal, ask whether there was a corresponding real risk event (loss, near-miss, incident, control failure).
| Parallel run signal pattern | What it means | What to do |
|---|---|---|
| >20% of amber/red signals have no real-risk correlate | False reds — threshold too tight or wrong metric | Widen bands or replace metric |
| Real risk events happen without amber triggers | False greens — threshold too loose | Tighten bands |
| 1-5% amber rate, real risk events trigger amber 80%+ of the time | Goldilocks | Go live |
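One way to automate the verdict from this table, assuming the parallel-run log is a list of daily observations (field names are hypothetical):

```python
# Hypothetical parallel-run log: one entry per observation day, recording
# whether the candidate threshold would have signaled and whether a real
# risk event (loss, near-miss, incident, control failure) occurred.
log = [
    {"signal": True,  "real_event": True},
    {"signal": True,  "real_event": False},
    {"signal": False, "real_event": False},
    # ... one entry per day of the 60-day window
]

signals = [d for d in log if d["signal"]]
events = [d for d in log if d["real_event"]]

false_red_rate = sum(1 for d in signals if not d["real_event"]) / max(len(signals), 1)
catch_rate = sum(1 for d in events if d["signal"]) / max(len(events), 1)
amber_rate = len(signals) / len(log)

if false_red_rate > 0.20:
    verdict = "False reds: widen bands or replace metric"
elif catch_rate < 0.80:
    verdict = "False greens: tighten bands"
elif 0.01 <= amber_rate <= 0.05:
    verdict = "Goldilocks: go live"
else:
    verdict = "Recalibrate and re-run"
```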
Step 5: Annual recalibration
Re-run the distribution every year. Compare current 95th percentile to last year’s. If it’s drifted materially, the threshold needs to drift too — or the risk environment has shifted and the threshold should hold while you investigate. Document either choice in writing.
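A sketch of the drift check; the ±20% materiality band here is an illustrative assumption, not a standard:

```python
p95_last_year = 1.4
p95_now = 1.9

drift = (p95_now - p95_last_year) / p95_last_year

# Illustrative materiality band: flag anything beyond +/-20% for documented review.
if abs(drift) > 0.20:
    print(f"p95 drifted {drift:+.0%}: recalibrate the threshold, or hold it "
          "and investigate the risk environment -- document either choice in writing.")
```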
The “Perpetual Green” Audit
Pull every KRI in your library. For each one, answer two questions:
- Has this metric triggered amber or red in the last 18 months?
- If amber/red were triggered today, would anyone actually do anything?
If the answer to both is no, the KRI is dead weight. It’s not key — it’s ornamental. The threshold is wrong, the metric is wrong, or the risk it’s monitoring isn’t actually material to the firm.
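A sketch of the audit filter, assuming the KRI library can be exported as records with a last-amber date (field names are hypothetical):

```python
from datetime import date, timedelta

# Hypothetical export of the KRI library; names and dates are illustrative.
kri_library = [
    {"name": "Failed trades rate", "last_amber": date(2025, 11, 3)},
    {"name": "Failed login attempts", "last_amber": None},  # never triggered
]

cutoff = date.today() - timedelta(days=18 * 30)  # roughly 18 months

perpetual_green = [
    k["name"] for k in kri_library
    if k["last_amber"] is None or k["last_amber"] < cutoff
]
print("Audit list:", perpetual_green)
```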
Three options:
Option A: Recalibrate. Pull the underlying data, redo the distribution analysis from Step 1, and tighten the bands. Most “perpetual green” KRIs come back to life with proper calibration.
Option B: Replace. The metric is measuring the wrong thing. Cyber’s “number of failed login attempts” KRI is famously perpetual-green at most institutions because raw failed logins are not the relevant risk signal — failed logins per privileged account in a 24-hour window is.
Option C: Retire. The risk it monitors isn’t material anymore, or the metric is decorative. Move it to a watch list — tracked but not reported — and free up board reporting bandwidth for KRIs that matter.
Practitioners building libraries from scratch should start with our KRI guide with 50+ examples by risk domain, which covers domain-specific metrics across operational, credit, cyber, liquidity, third-party, and model risk.
Examples: Common KRI Threshold Mistakes
Concrete failure modes from real risk libraries.
Liquidity KRI: LCR threshold set to regulatory minimum
The mistake: Red at 100% LCR (regulatory minimum). Amber at 110%.
Why it fails: By the time LCR hits 110%, the institution is already in serious trouble — wholesale funding has tightened, deposits are draining, and the buffer is being eaten. Amber should trigger at 130-140% LCR for most institutions. Red at 115-120%. The regulatory minimum is not a risk threshold — it’s a violation threshold.
Cyber KRI: critical vulnerability patching SLA
The mistake: Red at “100% of critical vulnerabilities patched within 30 days.” Amber at 95%.
Why it fails: Critical vulnerabilities under NIST SP 800-40 Rev. 4 and modern incident response guidance get exploited in days, not weeks. 30 days as a “critical” threshold is the OS-patching SLA from 2010. Red should be 7-14 days for true criticals. The threshold was set against historical IT practice, not current attacker velocity.
Compliance KRI: BSA SAR filing within 30/60 days
The mistake: Red at “any SAR filed after 30/60 day deadline.” Amber set at 90% timeliness.
Why it fails: A single late SAR is a regulatory issue. Threshold should be 100% red — no amber band — and the metric should be supplemented by leading indicators (alerts pending review beyond 14 days, cases without disposition beyond 21 days). Lagging metrics with binary breach states need leading-indicator companions.
Operational risk KRI: number of operational loss events
The mistake: Red at “more than 10 operational losses per quarter.”
Why it fails: Loss count without severity tells you nothing. Ten $500 customer-reimbursement events are not the same risk profile as one $2 million wire fraud. Use severity-weighted aggregate loss, or split into count-and-severity matrices.
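A sketch of the severity-weighted alternative, with illustrative bucket boundaries:

```python
# Quarterly loss events, each entry one event's loss in USD (illustrative).
losses = [500, 500, 500, 2_000_000, 12_000]

# Severity-weighted aggregate: report total loss alongside count,
# instead of a bare event count.
aggregate = sum(losses)
count = len(losses)

# Or a simple count-and-severity matrix: bucket events by loss size.
buckets = {"<10k": 0, "10k-1m": 0, ">=1m": 0}
for loss in losses:
    if loss < 10_000:
        buckets["<10k"] += 1
    elif loss < 1_000_000:
        buckets["10k-1m"] += 1
    else:
        buckets[">=1m"] += 1

print(f"count={count}, aggregate=${aggregate:,}", buckets)
```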
Third-party risk KRI: percentage of critical vendors with current SOC 2
The mistake: Red at “less than 90% of critical vendors have current SOC 2.”
Why it fails: 90% is a process metric, not a risk metric. The risk question is “which critical vendors don’t have a current SOC 2 and what compensating control covers the gap?” Red should be “any critical vendor without current SOC 2 or documented compensating control” — a 100% threshold tied to remediation status, not aggregate compliance.
Dynamic vs. Static Thresholds
Static thresholds — set annually, applied uniformly — are the default and the right choice for most KRIs. They’re explainable to examiners, auditable, and stable enough that a year-over-year comparison means something.
Dynamic thresholds — recalibrated automatically based on rolling statistics — are increasingly used for metrics with strong seasonality (fraud loss rates, transaction-volume KRIs, ATO attempt rates). They work, but they introduce three problems:
- Explainability. Examiners want to know why the threshold was X. “The model set it” is not an answer.
- Drift risk. A dynamic threshold trained on six months of escalating fraud will treat current fraud levels as normal. The base rate adjustment can mask deteriorating control environments.
- Override discipline. Dynamic thresholds need human override paths with documented criteria. Without them, you’re delegating risk appetite to a model.
If you use dynamic thresholds, write the recalibration algorithm in plain English, document the lookback window, set hard ceilings and floors that the dynamic threshold cannot move past, and require a documented human review before the threshold updates.
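A minimal sketch of those guardrails, with an illustrative lookback window and bounds; the clamp implements the hard ceiling and floor the dynamic threshold cannot move past:

```python
import statistics

def dynamic_red(history, lookback=180, floor=1.0, ceiling=3.0):
    """Rolling 95th-percentile red threshold, clamped to hard bounds
    the recalibration cannot move past. All parameters are illustrative,
    and any update should still pass a documented human review."""
    window = history[-lookback:]
    p95 = statistics.quantiles(window, n=20)[-1]
    return max(floor, min(p95, ceiling))  # never below floor, never above ceiling
```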
How Thresholds Should Connect to Risk Appetite
The red threshold for a KRI must equal or be tighter than the quantitative risk appetite limit for that risk. This is the simplest and most violated rule in KRI design.
| Risk | Appetite Statement | Wrong Threshold | Right Threshold |
|---|---|---|---|
| Liquidity | “Maintain LCR above 120% under base case” | Red at 100% LCR | Red at 120% LCR |
| Credit | “Net charge-offs below 1.2% annualized” | Red at 1.5% | Red at 1.2% |
| Cyber | “Zero critical vulnerabilities open beyond 14 days” | Red at >5% open | Red at any critical past 14 days |
| AML | “100% SARs filed within statutory deadline” | Red at <95% timeliness | Red at any late SAR |
When the threshold is looser than appetite, the dashboard cannot warn you that you’ve breached appetite. By design.
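A sketch of the cross-check for metrics where higher readings mean more risk (for a higher-is-better metric like LCR, the comparison flips); the records are hypothetical:

```python
# Hypothetical mapping of quantitative appetite limits to KRI red thresholds,
# for metrics where higher readings mean more risk.
checks = [
    {"risk": "Credit net charge-offs", "appetite": 1.2, "red": 1.5},
    {"risk": "Cyber criticals >14 days", "appetite": 0.0, "red": 0.0},
]

for c in checks:
    if c["red"] > c["appetite"]:
        print(f"{c['risk']}: red ({c['red']}) is looser than appetite "
              f"({c['appetite']}) -- the dashboard cannot warn on an appetite breach.")
```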
For practitioners thinking about how KRIs feed into broader risk reporting, our piece on RCSA methodology and workshop facilitation walks through how risk ratings from the RCSA process should map cleanly to KRI red/amber breakpoints.
So What?
KRI thresholds are not a calibration detail. They’re the difference between a dashboard that warns you and a dashboard that lies to you.
Three actions to take this week:
- Pull the perpetual-green list. Any KRI that hasn’t triggered amber in 18 months goes on the audit list. Recalibrate, replace, or retire.
- Cross-check thresholds against appetite. For every quantitative appetite statement, find the matching KRI red threshold. If red is looser than appetite, fix it before the next board meeting.
- Document the 60-day parallel run policy. Any new KRI or recalibrated threshold goes through a parallel run before it goes live in board reporting. Get the policy in writing.
Our KRI Library ships with 50+ pre-calibrated KRIs across domains, threshold rationales, and the parallel-run template — built so the calibration work happens once, not every time the board asks why the dashboard is green.
Green dashboards make leadership comfortable. Comfortable leadership doesn’t ask questions. The question the audit committee should be asking is not “are we green?” — it’s “would we know if we weren’t?”