Risk Scoring Techniques: Likelihood × Impact and the 4 Variations Examiners Push Back On
TL;DR
- Likelihood × Impact is the standard — but four application errors turn it from a defensible tool into an examination finding generator
- The most common single finding: inherent and residual scores nearly identical, signaling controls were never actually evaluated
- “Problems with scoring methods and ordinal scales in risk assessment” formally documented why multiplying ordinal scales produces pseudo-quantitative results — use L×I for banding decisions, not fine-grained comparisons
- Impact thresholds must be calibrated to your institution’s size — a $200M credit union and a $10B bank cannot share the same dollar definitions of “Catastrophic”
You spent three days building your risk and control self-assessment. Forty risks, each scored on a 5×5 likelihood-impact matrix, heat map color-coded, formatted for the board deck. Then the examiner opens to risk #7 — Fraud Risk in Account Opening — and asks: “Walk me through how you arrived at a composite score of 20.”
And you realize: you don’t have a defensible answer.
This is the silent failure in most operational risk programs. The L×I formula is nearly universal — endorsed by COSO ERM, ISO 31000, and accepted by every bank examiner in the U.S. But how teams apply it determines whether the RCSA holds up to scrutiny or generates findings. Four variations account for most of the pushback.
The Math Problem Nobody Talks About
Before the four variations: a confession about the formula itself.
Likelihood × Impact sounds quantitative. It isn’t. The 1–5 scale you assign to likelihood is an ordinal scale — it ranks risks in order of rough probability but doesn’t measure equal intervals between ratings. A “3” likelihood isn’t exactly 1.5× more likely than a “2.” It just means “more probable.” When you multiply two ordinal numbers, you produce a risk priority number that looks precise but carries the same ordinal limitations.
The academic record on this is unambiguous. The 2010 paper “Problems with scoring methods and ordinal scales in risk assessment” (IBM Journal of Research and Development) established that ordinal-scale multiplication routinely produces reversed rankings and uninformative ratings — assigning identical scores to risks that differ by orders of magnitude, or ranking lower-risk scenarios above higher-risk ones. FMEA programs using Risk Priority Numbers run into the same structural problem.
None of this means you should abandon L×I. Most regulatory frameworks — COSO ERM and ISO 31000:2018 included — accept it as a practical approximation, and most examiners expect it. The problem arises when programs treat the output as more precise than it is: ranking Risk #37 (score 15) as meaningfully more concerning than Risk #22 (score 14) as if the one-point difference reflects real quantitative measurement. It doesn’t.
The defensible position: use L×I to categorize risks into bands (Critical, High, Medium, Low). Make decisions at the band level. Never present adjacent scores as precise comparisons across different risk categories.
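To make the band-level approach concrete, here is a minimal sketch in Python. The cut-points of 20, 12, and 6 are illustrative assumptions, not a standard; calibrate them to your own matrix and risk appetite.

```python
def composite_score(likelihood: int, impact: int) -> int:
    """Ordinal 1-5 likelihood times 1-5 impact: a banding input, not a measurement."""
    return likelihood * impact


def risk_band(score: int) -> str:
    """Map a composite score to a decision band.
    Cut-points are illustrative assumptions; calibrate to your own matrix."""
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


# Adjacent scores land in the same band: the one-point gap is noise, not signal.
print(risk_band(composite_score(4, 4)))  # 16 -> "High"
print(risk_band(composite_score(5, 3)))  # 15 -> "High"
```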
Variation 1: Inherent and Residual Scores Nearly Identical
This is the single most common RCSA finding in operational risk examinations — and the most telling signal that an assessment didn’t actually evaluate controls.
The theory is clear: inherent risk is the raw exposure before any controls; residual is what remains after. The gap between them quantifies the work your control environment is doing. A wide gap means strong controls with evidence. A narrow gap means either your controls are nearly ineffective or you didn’t evaluate them.
When assessors score inherent and residual in the same workshop, at the same time, for the same risk — the social dynamics of the room drive scores toward similarity. Nobody wants to say their business unit is catastrophically risky. So they assign an inherent score, knock it down a few points, and call it residual. The gap averages 10–15% across the portfolio.
An examiner sees a portfolio where every inherent score of 20 resolves to a residual of 17 or 18 and has one interpretation: the team started from the residual and worked backward. No real control evaluation occurred.
| Control Effectiveness | Expected Inherent-to-Residual Shift |
|---|---|
| Strong (tested, KRI-supported, audit opinion “effective”) | 60–75% risk reduction |
| Adequate (tested periodically, some gaps) | 40–60% reduction |
| Weak (design issues or testing gaps identified) | 10–30% reduction |
| Not Tested | Residual should remain near inherent until testing occurs |
Fix: Decouple the control evaluation from the inherent scoring. Before assigning a residual score, require an explicit control effectiveness rating (Strong, Adequate, Weak, Not Tested) for each control. Tie that rating to evidence — KRI data, last audit opinion, most recent control test result, loss event history. Document the evidence in the RCSA. The residual score then flows from the control effectiveness rating, not from intuition.
If there’s no evidence for a control rating, the control should be rated “Not Tested” and the residual should reflect that uncertainty — staying near inherent until testing is completed.
This is the structural fix that the RCSA methodology addresses in detail: separate the inherent discussion from the control evaluation, and require evidence before the residual is assigned.
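A minimal sketch of how the residual can flow mechanically from the control effectiveness rating rather than from intuition. The reduction percentages mirror the conservative end of each range in the table above, which is itself an assumption to tune; the helper name and rounding rule are illustrative.

```python
import math

# Conservative end of each reduction range from the table above (an assumption; tune to taste).
REDUCTION_BY_RATING = {
    "Strong": 0.60,
    "Adequate": 0.40,
    "Weak": 0.10,
    "Not Tested": 0.00,  # residual stays at inherent until testing occurs
}


def residual_score(inherent: int, control_rating: str) -> int:
    """Derive the residual from the inherent score and the evidenced control rating.
    Rounded up so the residual never understates remaining exposure."""
    reduction = REDUCTION_BY_RATING[control_rating]
    return max(1, math.ceil(inherent * (1 - reduction)))


# An inherent 20 with tested, Strong controls resolves to 8: a visible gap,
# not the 17-or-18 pattern examiners read as backing into the answer.
print(residual_score(20, "Strong"))      # 8
print(residual_score(20, "Not Tested"))  # 20
```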
Variation 2: Impact Thresholds Not Calibrated to Your Institution
Generic templates are convenient. They’re also one of the most reliable sources of examiner pushback for community and regional banks.
A template that defines “financial impact > $500,000 = Catastrophic” produces absurd results when applied without context. At a $50B regional bank, $500K is manageable. At a $200M community bank, $500K might represent a significant portion of quarterly net income. Same threshold, opposite practical meaning.
ISO 31000:2018 explicitly requires risk assessment to be conducted in the context of the organization — which includes size, complexity, and risk appetite. COSO ERM’s 2017 framework makes the same point in its guidance on setting risk tolerances: thresholds must be calibrated to what is material to your organization, not imported from a template written for a generic institution.
The tell in an examination: when a $200M credit union uses the same impact calibration table as a $10B bank because both used the same off-the-shelf RCSA template. Examiners at both the OCC and FDIC — whose Risk Management Manual of Examination Policies addresses risk rating adequacy — will ask whether your thresholds are calibrated to your actual risk profile.
Fix: Write your Catastrophic definition in organizational terms first: “An event that would materially impair our ability to operate or would require emergency board intervention.” Then translate that principle into dollar amounts, customer counts, regulatory consequences, and reputational harm specific to your institution’s size and risk appetite. Assign numbers to each severity level proportionally. Document the calibration rationale. When your size or risk profile changes materially — acquisition, product launch, regulatory consent order — revisit the calibration.
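A minimal sketch of that translation step, assuming each severity level is anchored to a share of annual net income. The percentages are illustrative assumptions, not regulatory values; substitute whatever base (capital, quarterly earnings, operating budget) your risk appetite statement actually uses.

```python
# Severity anchors as a share of annual net income (illustrative assumption, not a standard).
SEVERITY_SHARE_OF_NET_INCOME = {
    "Catastrophic": 0.50,
    "Major": 0.20,
    "Moderate": 0.05,
    "Minor": 0.01,
    "Negligible": 0.001,
}


def financial_impact_thresholds(annual_net_income: float) -> dict:
    """Translate qualitative severity definitions into dollar thresholds
    proportional to the institution's own earnings."""
    return {level: round(annual_net_income * share)
            for level, share in SEVERITY_SHARE_OF_NET_INCOME.items()}


# The same principle yields very different "Catastrophic" dollar lines
# for a small credit union and a multi-billion-dollar bank.
print(financial_impact_thresholds(2_000_000))
print(financial_impact_thresholds(150_000_000))
```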
The COSO ERM Framework guide covers this in the context of setting risk appetite and tolerance — the same calibration exercise, applied to the ERM architecture that feeds your RCSA.
Variation 3: Scores Without Supporting Evidence
You’ve rated Fraud Risk as Likelihood 5 × Impact 5 = 25. The examiner asks: “What evidence supports a likelihood rating of 5?” If the answer is “we discussed it in the workshop” or “it seemed right” — that’s a finding.
RCSA workshops are social processes. The most senior person in the room often anchors the score, the group adjusts toward consensus, and the discussion ends when agreement forms. This produces ratings that reflect hierarchy and gut feel, not observable data. Examiners increasingly expect scores to be grounded in something more durable.
Evidence that supports likelihood scores:
- Internal loss events from the past 12–24 months (count, dollar amount, frequency trend)
- Near-misses and operational incidents logged in the loss event database
- KRI trend data — are leading indicators moving toward or away from threshold?
- Internal audit findings in the relevant process area
- Industry loss data or regulatory enforcement actions in comparable peer institutions
Evidence that supports impact scores:
- Historical cost of similar events at your institution
- Regulatory penalty ranges for the relevant risk type
- Customer complaint volume or churn associated with similar events
- Estimated recovery time and operational disruption cost
The KRI Library exists precisely for this: pre-built indicators that give your team observable data points to anchor likelihood scores before the workshop begins. A pre-populated evidence brief — distributed before the session, not assembled during it — changes the conversation from “what do we feel?” to “what does the data show?”
None of this requires perfect data. A single documented evidence citation per dimension — “Likelihood = 4 based on three payment fraud events in the prior 24 months totaling $180K in recoverable losses” — transforms a subjective gut check into a defensible position.
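One way to make the one-citation-per-dimension rule enforceable is to treat the citation as a required field on the RCSA record itself. A minimal sketch, with the field names and example entry as illustrative assumptions:

```python
from dataclasses import dataclass


@dataclass
class ScoredDimension:
    """A likelihood or impact rating is incomplete without its evidence citation."""
    rating: int     # 1-5 ordinal rating
    evidence: str   # required: loss events, KRI trend, audit finding, peer enforcement action


@dataclass
class RcsaRiskEntry:
    risk_name: str
    likelihood: ScoredDimension
    impact: ScoredDimension


# Illustrative entry mirroring the example citation above.
entry = RcsaRiskEntry(
    risk_name="Payment fraud in account opening",
    likelihood=ScoredDimension(
        rating=4,
        evidence="Three payment fraud events in prior 24 months totaling $180K in recoverable losses",
    ),
    impact=ScoredDimension(
        rating=3,
        evidence="Historical remediation and recovery cost of comparable events at this institution",
    ),
)
print(entry.risk_name, entry.likelihood.rating, entry.likelihood.evidence)
```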
Variation 4: Using Composite Scores to Compare Across Risk Categories
This one is subtler — and shows up most often in board risk reporting.
When you rank “Cybersecurity Risk (score 21)” above “Regulatory Compliance Risk (score 18)” in a single league table, you’re implying they can be meaningfully compared on the same quantitative scale. In most RCSA designs, they can’t.
Cybersecurity and compliance risks have different consequence types, different evidence bases, different control environments, and different regulatory treatments. Their composite scores are ordinal rankings within their respective domains. Treating the scores as directly comparable produces misleading prioritization — one that regulators and internal auditors increasingly push back on at larger, more sophisticated institutions.
The Basel Committee’s Principles for the Sound Management of Operational Risk (PSMOR, 2021) emphasizes that operational risk programs should use multiple data inputs and avoid over-reliance on any single metric. Using cross-category score comparisons as a primary prioritization mechanism runs counter to this principle.
Fix: Present L×I scores as within-category prioritization tools. For the board risk report, use qualitative judgment — informed by the scores, but not mechanically produced by them — to structure the risk narrative. “Our top five operational risk concerns this quarter are…” followed by narrative context is more defensible than a league table of composite scores from different risk domains.
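A minimal sketch of within-category prioritization, reusing the banding idea from earlier. The risk names, categories, and scores are illustrative; the point is that ranking happens inside each category rather than across the whole portfolio in one league table.

```python
from collections import defaultdict

# (risk name, category, composite L x I score): illustrative entries only.
risks = [
    ("Ransomware on core banking platform", "Cybersecurity", 21),
    ("BSA/AML monitoring gaps", "Regulatory Compliance", 18),
    ("Vendor data-center outage", "Cybersecurity", 12),
    ("Fair lending disclosure errors", "Regulatory Compliance", 10),
]

by_category = defaultdict(list)
for name, category, score in risks:
    by_category[category].append((name, score))

# Rank within each category only; never merge into one cross-category league table.
for category, items in by_category.items():
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    print(category, ranked)
```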
So What?
None of these four variations require rebuilding your program. They require tightening the discipline around how scores get assigned and documented.
The practical path forward:
- Add a control effectiveness rating (Strong, Adequate, Weak, Not Tested) as a required field in your RCSA template, separate from the residual score. The residual flows from the rating, not the other way around.
- Calibrate impact thresholds to your institution’s actual size and risk appetite — in writing, with a documented rationale. Revisit whenever your profile changes materially.
- Build a pre-work evidence brief for each RCSA cycle: pull internal loss events, KRI trend data, and audit findings before the workshop. Require one evidence citation per risk dimension.
- Adopt band-based decision-making rather than rank-ordering by composite scores. Decisions happen at the Critical/High/Medium/Low band level. Adjacent scores within a band are noise, not signal.
These changes address the most common examination findings without a full program rebuild — and they make the RCSA a more accurate reflection of where your actual risk lies.
The RCSA (Risk & Control Self-Assessment) template provides a structured framework with separate inherent scoring, control effectiveness rating fields, evidence documentation prompts, and a pre-populated evidence brief template — built to address the four variations examiners flag most often.
Rebecca Leung
Rebecca Leung has 8+ years of risk and compliance experience across first and second line roles at commercial banks, asset managers, and fintechs. Former management consultant advising financial institutions on risk strategy. Founder of RiskTemplates.