
Sanctions Screening Techniques: Tuning False Positives Without Missing Real OFAC Hits

May 8, 2026 · Rebecca Leung

TL;DR

  • OFAC’s SDN List contains 12,000+ designated individuals, companies, vessels, and aircraft — and is updated multiple times per week. Screening against a stale list is its own compliance failure.
  • The tension in sanctions screening is calibration: too strict a match threshold (approaching 100% exact match) misses real hits due to transliteration and spelling variants; too loose floods alert queues and buries true positives under noise.
  • Industry practice clusters around 80–90% fuzzy match thresholds. You must document your threshold choice, link it to your risk assessment, and be able to defend it to examiners — not just name the number.
  • In 2025, OFAC issued 14 enforcement actions totaling $265 million in penalties. Inadequate screening controls appeared as a contributing factor in multiple actions. OFAC also extended record-keeping requirements to ten years starting March 2025.

The compliance analyst who gets 400 alerts a day and clears 398 of them isn’t doing sanctions screening. They’re doing a ritual. The system flags everything, the human clears everything, and the real hit — when it finally shows up — gets lost in the workflow.

That’s the false positive problem in sanctions screening. And it’s not a technology failure — it’s a calibration failure. Match thresholds set too loosely, a list scope that was never thought through, name-matching algorithms that haven’t been tuned since implementation. The result: alert fatigue that makes your screening program look busy while actually being blind.

Getting calibration right means understanding how name matching algorithms work, where your threshold should sit given your customer population and risk profile, what OFAC actually expects you to document, and — critically — what a true hit looks like when it finally surfaces.

What You’re Screening Against (And Why List Selection Matters)

Most people think “OFAC screening” and think “SDN List.” That’s the right starting point, but it’s not the complete picture.

OFAC maintains more than 30 active sanctions programs. The primary lists:

| List | What It Covers |
| --- | --- |
| SDN List | Individuals, entities, vessels, and aircraft with blocked assets. Most financial institutions are required to screen against this. |
| Sectoral Sanctions Identifications (SSI) List | Russian entities subject to sectoral restrictions (not full blocking) under EO 13662. Transaction-type restrictions, not full prohibitions. |
| Non-SDN Menu-Based Sanctions (NS-MBS) List | Entities subject to correspondent account or payable-through account restrictions. |
| Foreign Sanctions Evaders (FSE) List | Foreign individuals and entities who have violated U.S. sanctions on Syria or Iran. |
| Non-SDN Chinese Military-Industrial Complex Companies (NS-CMIC) List | Chinese military-industrial complex entities; limits on U.S. investment. |
| Consolidated Sanctions List | Machine-readable combined list covering all of the above plus additional programs. |

For most financial institutions, the minimum is SDN + country-based program restrictions. But your OFAC risk assessment — which should drive your list selection — may require additional lists depending on your customer population, transaction types, and counterparty geographies.

Three countries remain under comprehensive sanctions as of 2026: Cuba, Iran, and North Korea. Any transaction involving these jurisdictions, regardless of the specific counterparty, triggers heightened review.

The list selection mistake: Institutions that only screen the SDN List and skip the consolidated lists miss SSI entities, FSE designees, and sectoral targets. This is specifically what examiners look for when they review list scope decisions.

How Name Matching Algorithms Work

Sanctions screening is fundamentally a name disambiguation problem. The person your customer is transacting with isn’t going to spell their name on the wire exactly as OFAC spelled it on the SDN List. Names get transliterated from Arabic, Russian, or Chinese characters. Aliases are used. Middle names appear or disappear. Entity names change after designation.

Algorithm families commonly used in screening platforms:

Exact Match

String equality — “Ali Hassan Akhbar” matches if and only if the target string is identical. Fast, zero false positives, many false negatives. Not appropriate as a sole matching technique; used as a first-pass filter.

Edit Distance (Levenshtein Distance)

Counts the minimum number of single-character edits (substitutions, insertions, deletions) needed to transform one string into another. “Mohammed” vs. “Mohamad” = distance of 2. Useful for catching typos and minor spelling variants. Less effective for transliteration differences where the underlying structure of the name is preserved but the Latin representation differs substantially.
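As a concrete reference, the edit-distance computation can be sketched in a few lines of Python (a minimal dynamic-programming implementation for illustration, not production screening code):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum single-character edits (substitute, insert, delete) to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances for the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                # delete from a
                curr[j - 1] + 1,            # insert into a
                prev[j - 1] + (ca != cb),   # substitute (free if characters match)
            ))
        prev = curr
    return prev[-1]
```

Running it on the example above, `levenshtein("Mohammed", "Mohamad")` returns 2 (drop one "m", substitute "e" with "a").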

Jaro-Winkler Similarity

A similarity score between 0 and 1 that weights matching characters and transpositions, with a prefix bonus — meaning matches at the beginning of the name carry more weight. Particularly useful for first names, where the beginning is more distinctive. “Mohamed” vs. “Mohammed” scores high. Less useful for long entity names.
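The computation itself can be sketched as follows (a minimal illustrative implementation; commercial platforms tune the prefix scale and layer their own weighting on top):

```python
def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity with the Winkler prefix bonus (prefix capped at 4 chars)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    window = max(len1, len2) // 2 - 1  # characters match only within this window
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # transpositions: matched characters that appear in a different order
    k = transpositions = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                transpositions += 1
            k += 1
    transpositions //= 2
    jaro = (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):  # Winkler bonus for a shared prefix
        if a != b:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)
```

With the article's example, `jaro_winkler("Mohamed", "Mohammed")` comes out at 0.975, comfortably above typical alerting thresholds.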

Phonetic Algorithms (Soundex, Metaphone, Double Metaphone)

Convert names to phonetic codes based on how they sound, not how they’re spelled. “Kadhafi,” “Gaddafi,” and “Qadafi” all encode to similar phonetic representations. Highly useful for Arabic and Persian name transliterations where multiple Latin romanizations are in circulation.
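A simplified Soundex implementation shows the encoding idea (real screening engines typically use Double Metaphone or custom phonetic tables; this sketch covers only the classic rules and assumes a non-empty ASCII input):

```python
def soundex(name: str) -> str:
    """Classic 4-character Soundex code: first letter plus up to three digits."""
    name = name.upper()
    codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
             **dict.fromkeys("DT", "3"), "L": "4",
             **dict.fromkeys("MN", "5"), "R": "6"}
    first = name[0]
    encoded = []
    prev = codes.get(first, "")
    for ch in name[1:]:
        if ch in "HW":
            continue  # H/W are transparent: they don't break a run of same codes
        code = codes.get(ch)
        if code is None:
            prev = ""  # vowels contribute nothing but do break runs
            continue
        if code != prev:
            encoded.append(code)
        prev = code
    return (first + "".join(encoded) + "000")[:4]
```

Here `soundex("Mohammed")` and `soundex("Mohamad")` both yield "M530", and "Gaddafi", "Kadhafi", and "Qadafi" all encode to the same digit sequence "310" after their (differing) first letters, which is why Soundex alone is usually paired with other techniques rather than used as the sole matcher.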

Token-Based Matching

Breaks multi-part names into tokens (words) and matches token-by-token. Critical for entity names: “Bank of International Settlement for Development” vs. “International Development Bank” — token overlap is high even though the order differs. Most commercial platforms combine token matching with a primary similarity algorithm.
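Token overlap is often scored with a Jaccard ratio after dropping corporate filler words. A minimal sketch (the stopword list here is an illustrative assumption, not a standard):

```python
STOPWORDS = {"of", "for", "the", "and"}  # illustrative corporate filler words

def tokens(name: str) -> set[str]:
    """Lowercased word tokens with filler words removed."""
    return {t for t in name.lower().split() if t not in STOPWORDS}

def token_jaccard(name1: str, name2: str) -> float:
    """Jaccard overlap of name tokens, ignoring word order."""
    t1, t2 = tokens(name1), tokens(name2)
    if not t1 or not t2:
        return 0.0
    return len(t1 & t2) / len(t1 | t2)
```

On the example above, `token_jaccard("Bank of International Settlement for Development", "International Development Bank")` scores 0.75 despite the very different word order, which exact and edit-distance matching would both miss.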

AI-Assisted Matching

A 2025 Federal Reserve working paper examined whether large language models can improve sanctions screening accuracy through fuzzy matching assessment. Early results suggest LLMs can outperform traditional algorithms in certain cross-language name variant scenarios. Not yet standard practice, but worth monitoring. If you pilot an AI layer, treat it as a model under OCC 2026-13 and validate accordingly.

The False Positive Problem: Why Calibration Is Everything

Here’s the core tension: a screening threshold set too high (approaching 100% exact match) catches almost nothing except literal duplicates. A threshold set too low (say, 60%) generates an alert for every vaguely similar name in your customer base and buries your team in noise.

Neither extreme is defensible. OFAC has cited institutions for both — inadequate matching that missed true hits, and inadequate review processes that allowed false positives to be cleared without documentation.

The practical target: 80–90% similarity score for fuzzy name matching, calibrated to your specific customer population and algorithm suite. This is where most institutions with mature programs land, and it’s the range most commonly referenced in third-party screening platform documentation and examiner guidance.

But the number alone doesn’t make you defensible. What makes you defensible is:

1. Documentation of how you set the threshold

Your threshold should trace back to your OFAC risk assessment. If your customers are predominantly domestic with common English names, 85% may be appropriate. If you serve cross-border populations with names from high-risk jurisdictions, you may need to drop to 80% or supplement your primary algorithm with phonetic matching.

2. Evidence that you tested the threshold

How? Run your current customer and counterparty list through your screening system at multiple threshold levels. Examine the output: What did the 75% threshold catch that the 85% threshold missed? Were any of those additional hits real? What’s the alert volume at each level? This is model validation methodology applied to screening — and OFAC examiners will ask for it.
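One way to run that kind of sweep, sketched in Python (`difflib.SequenceMatcher` stands in for your platform's tuned similarity scorer, and the names are illustrative):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    # stand-in scorer; a production system would use its tuned algorithm suite
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def threshold_sweep(screened_names, list_names,
                    thresholds=(0.75, 0.80, 0.85, 0.90)):
    """Alert volume at each candidate threshold, with the pairs each level flags."""
    results = {}
    for th in thresholds:
        alerts = []
        for name in screened_names:
            for target in list_names:
                score = similarity(name, target)
                if score >= th:
                    alerts.append((name, target, round(score, 3)))
        results[th] = alerts
    return results
```

Comparing `results[0.75]` against `results[0.85]` surfaces exactly the question the examiner will ask: what did the lower threshold catch that the higher one missed, and were any of those additional hits real?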

3. Documented tuning decisions

Every time you change a threshold, update an algorithm, add or remove a list, or adjust screening logic — document it. What changed, who approved it, what testing was done, what the expected impact on alert volume was. Examiners want to see a documented tuning history, not just your current settings.

The False Positive Resolution Process

Not all alerts can be auto-cleared. Here’s a decision framework:

| Alert Type | Disposition Approach |
| --- | --- |
| Clear non-match | Name similarity high but all other data (DOB, address, nationality, entity type) inconsistent. Document the data points reviewed. Clear with brief notation. |
| Potential match | Name plus at least one additional data element consistent. Escalate to second review. Pull additional identity verification data. Document the decision logic. |
| True hit | High name similarity plus multiple corroborating data points. Block the transaction. File a blocked transaction report with OFAC within 10 business days. Notify compliance and legal. |
| Rejected transaction | Payment destined for a sanctioned jurisdiction or counterparty you cannot transact with but do not block (e.g., certain SSI restrictions). File a rejected transaction report within 10 business days. |

The paper trail on cleared potential matches is where institutions get hurt. If your system generates 400 alerts and analysts clear 398 with no documentation, examiners will question whether those 398 were actually reviewed. Documented clearing decisions — even brief ones — are non-negotiable.

What Examiners Actually Test

OFAC doesn’t audit you directly (unless you’ve had an enforcement action). But the OCC, FDIC, Federal Reserve, NYDFS, FINRA, and FinCEN all examine your OFAC screening program as part of their BSA/AML examination. Here’s what they focus on:

List currency: Are you screening against current lists? Is there a documented SLA for how quickly list updates are consumed? Are daily quality controls verifying list freshness?

Screening scope: Which lists are you screening? Why? Does your list selection match your risk profile? Are customers, beneficial owners, counterparties, and correspondent banks all in scope?

Threshold documentation: What’s your match threshold? How was it set? When was it last reviewed? Is the threshold documented in your procedures?

Tuning history: Has the threshold changed? Why? What testing was done before and after the change?

Alert resolution quality: Pull a sample of cleared alerts. Is the basis for clearing documented? Are potential matches appropriately escalated? Is there a second-reviewer requirement for potential matches?

OFAC blocking and rejection procedures: Do you know the difference? A blocked transaction (SDN hit) requires a different response than a rejected transaction (prohibited counterparty you can legally decline to serve but don’t block).

Record retention: Effective March 2025, OFAC requires ten years of record retention — up from five. If you’re on a five-year policy, update it now.

Name Matching Edge Cases That Catch Teams Off Guard

A few scenarios that generate disproportionate false positives — and how to handle them:

Common names in high-risk jurisdictions: “Mohammed Ali” generates an SDN hit because there are multiple Mohammed Alis on the list. Resolution: supplemental data (DOB, nationality, address, account context) is the tiebreaker. Document the specific corroborating or distinguishing data points.

Transliteration variants: Arabic names transliterate into English in multiple standard ways. Your screening system needs phonetic matching to catch these — exact match won’t. Test specifically for this with a set of known SDN names in Arabic, Persian, or Russian, transliterated multiple ways, run against your system.

Partial name hits: Entity name contains a word that appears in an SDN entity name. “Global Finance Corp” flagging because “Global” appears in an SDN entity. Configure your system to require minimum token overlap before generating an alert — a single-word match in a multi-word name is usually noise.
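A minimum-token-overlap gate can be sketched like this (the two-token minimum and the example entity names are illustrative assumptions; your actual floor belongs in your documented tuning decisions):

```python
def passes_token_gate(candidate: str, sdn_entity: str, min_shared: int = 2) -> bool:
    """Suppress alerts where a multi-word name shares too few tokens with the SDN entry."""
    shared = set(candidate.lower().split()) & set(sdn_entity.lower().split())
    # single-token names can still alert on one shared token
    required = min(min_shared, len(candidate.split()), len(sdn_entity.split()))
    return len(shared) >= required
```

Under this gate, "Global Finance Corp" sharing only the token "Global" with a hypothetical SDN entity never reaches the alert queue, while a name sharing two or more substantive tokens still does.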

Married/maiden names: Particularly relevant for PEP screening and for sanctions programs that designate family members of sanctioned persons; entities captured under OFAC’s 50 Percent Rule raise the same name-variant problem. Beneficial ownership records must include all known name variants.

Alias fields: OFAC publishes known aliases for SDN entries. Your system must screen against the alias fields, not just the primary name field. This is a common gap in older or lightly configured screening systems.
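Conceptually, alias-aware screening just takes the best score across the primary name and every published alias. A sketch (using `difflib` as a stand-in scorer; the record structure and entry shown are hypothetical, not OFAC's file format):

```python
from difflib import SequenceMatcher

def best_alias_score(screened_name: str, sdn_entry: dict) -> float:
    """Highest similarity across the primary name and every alias on the entry."""
    names = [sdn_entry["name"], *sdn_entry.get("aliases", [])]
    return max(
        SequenceMatcher(None, screened_name.lower(), n.lower()).ratio()
        for n in names
    )
```

A system that scores only `sdn_entry["name"]` and never iterates the alias fields will look identical in a demo and fail precisely on the designees who are most motivated to transact under another name.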

Building a Defensible Screening Program

The architecture of an OFAC screening program that holds up at exam:

Governance: Written procedures covering screening scope, list selection rationale, threshold settings, alert resolution requirements, escalation criteria, blocked/rejected transaction procedures, and record retention. Updated at least annually and after material changes.

Pre-screening at onboarding: Customers, beneficial owners, and authorized signatories screened before account opening. If you’re in the payments space, counterparties screened before transaction completion.

Ongoing screening: Periodic rescreening of existing customer base when OFAC list updates occur. Don’t rely solely on transaction-level screening — a customer who was clean at onboarding may be added to the SDN List the following month.

Transaction screening: Real-time or near-real-time screening for payment origination and receipt. Wire instructions (name, address, SWIFT BIC) in scope.

Model validation: Your screening system is a model. It should be subject to initial validation before deployment, periodic revalidation (annually or when logic changes), and independent challenge. Document the validation, including testing methodology and results. The Wolf & Co guide on OFAC model validation outlines what a defensible validation process covers.

KRI integration: Connect screening metrics to your risk monitoring framework. Useful KRIs include: daily alert volume, average alert resolution time, true hit rate (percentage of alerts escalating to potential match or above), and list update lag time. Alert volume spikes or resolution time increases are early warning indicators that the program needs attention. Consider using a KRI library to benchmark your sanctions-specific metrics against standard operational risk indicators.
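Those KRIs are straightforward to compute from alert dispositions. A sketch, assuming a simple export from your case management system (the field names and disposition labels are assumptions, not a standard schema):

```python
from datetime import timedelta

def screening_kris(alerts: list[dict]) -> dict:
    """alerts: non-empty list of dicts with 'disposition' and 'resolution_time' (timedelta)."""
    total = len(alerts)
    escalated = sum(1 for a in alerts
                    if a["disposition"] in {"potential_match", "true_hit"})
    avg_resolution = sum((a["resolution_time"] for a in alerts), timedelta()) / total
    return {
        "daily_alert_volume": total,
        "true_hit_rate": escalated / total,  # share escalating to potential match or above
        "avg_resolution_minutes": avg_resolution.total_seconds() / 60,
    }
```

Trending these values day over day is what turns screening metrics into early-warning indicators rather than after-the-fact exam exhibits.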

Linking Screening to Your Broader BSA/AML Program

Sanctions screening doesn’t exist in isolation. The same AML risk assessment that evaluates money laundering risk should inform OFAC screening scope and calibration. High-risk geographies for AML purposes are often also high-risk for sanctions — and the customer due diligence you collect for KYC purposes is the same data you’ll rely on to resolve potential matches.

If your AML program identifies a customer as a higher-risk profile, your OFAC screening should reflect that — more frequent rescreening, enhanced documentation of cleared alerts, and a lower escalation threshold for potential matches.

So What?

If your current sanctions screening process amounts to “the system flags, the analyst clears, we move on,” you have a documentation problem and probably a calibration problem. The question isn’t whether you’re screening — it’s whether your screening program is designed to actually catch real hits, and whether you can prove it.

Start with your threshold. If you can’t explain why it’s set where it is, or when it was last tested, that’s your first action item. Run your current customer base through the system at your current threshold, then at 5% above and below. Compare the delta. Document what you find.

Then look at your clearing documentation. Pull 20 cleared alerts from the last 90 days. Is the basis for each clearing decision documented? Would an examiner who looked at the same alert agree with the conclusion? If not, you have a process gap before you have a false positive problem.


The OFAC Risk Assessment Template covers how to score sanctions exposure across customers, geographies, products, and channels — the foundation that should drive your screening configuration.

For the full BSA/AML picture, the SAR filing and narrative guide covers what happens after screening surfaces a suspicious pattern that goes beyond a potential match.

Frequently Asked Questions

What match threshold should we use for OFAC sanctions screening?
There is no single right answer — OFAC explicitly says institutions must set their own thresholds based on their risk assessment. Industry practice clusters around 80–90% similarity scores for fuzzy name matching. A threshold below 80% typically floods alert queues with noise; above 95% risks missing true hits due to transliteration differences or intentional obfuscation. Document your threshold choice, link it to your risk assessment, and be prepared to explain the rationale — not just the number — to examiners.
What's the difference between a false positive and a potential match in sanctions screening?
A false positive is a system-generated alert where, after review, the screened entity is clearly not the sanctioned person or entity. A potential match — sometimes called a reasonable match — requires human analysis to resolve. The difference matters: regulators want you to resolve potential matches through a documented decision process (identity data, geographic data, DOB, address), not just auto-clear them. False positives that are auto-cleared without documentation are an exam finding waiting to happen.
What fuzzy matching techniques does OFAC use on its own screening tool?
OFAC's Sanctions List Search tool uses fuzzy logic on name fields including character and string similarity matching and phonetic matching. Common algorithm families used in commercial screening tools include Levenshtein distance (edit distance — counts character substitutions, insertions, deletions), Jaro-Winkler (weighted toward the front of names — useful for first names), and phonetic algorithms like Soundex or Double Metaphone that match names based on how they sound rather than how they're spelled. Most enterprise screening platforms combine multiple algorithms and weight them.
Can we use AI or LLMs to reduce sanctions screening false positives?
Emerging approaches are being studied. A 2025 Federal Reserve working paper examined whether LLMs can improve sanctions screening accuracy through fuzzy matching assessment. The results are promising for certain name variation scenarios, but LLM-based screening isn't yet a standard compliance practice. If your institution experiments with AI-assisted screening, treat it as a supplemental layer subject to model risk management — it's a model under SR 11-7 / OCC 2026-13 and requires validation, documentation, and governance before it's defensible at exam.
How often do we need to update our screening lists?
OFAC updates the SDN List and other consolidated lists multiple times per week. Your screening system must consume those updates in near-real-time — or at minimum, within a defined SLA your program documents. Manual or batch updates that lag by more than a day are an exam risk. Examiners have cited institutions for screening against stale lists. Most institutions receiving automated list feeds should validate that the feed is current as part of their daily quality controls.
What records does OFAC require us to keep on screening decisions?
Effective March 2025, OFAC extended the standard record-keeping requirement from five years to ten years. This covers blocked transaction records, rejected transaction records, and the documentation supporting cleared potential matches. Every alert resolution — including your reasoning for clearing a potential match — should be captured in your case management system with timestamp, analyst ID, and the evidence reviewed.
Rebecca Leung

Rebecca Leung has 8+ years of risk and compliance experience across first and second line roles at commercial banks, asset managers, and fintechs. Former management consultant advising financial institutions on risk strategy. Founder of RiskTemplates.

