
Ep 1: Fraud Filters, False Positives & the Missing Rails – Fraud Prevention

Fraud Prevention Series


The frontline reality


A branch approves small-ticket MSME loans on a tight 48-hour TAT. Day 1, Hunter throws a “match on mobile + address” alert. Day 2, CFR flags a name similar to a prior fraud; the branch freezes, the borrower walks away. A few weeks into this, teams quietly push checks to post-sanction or bypass them on so-called “low-risk” files. This isn’t malice; it’s friction. And it’s why policy intent (use fraud databases) often bows to the pressure of targets.


Yet, let’s be clear: this isn’t a minor issue. Organisations around the world lose, on average, 5% of their annual revenue to fraud each year (source: ACFE).


Consider the implications: for every ₹100 crore of annual business, ₹5 crore is being stealthily drained—often unnoticed for months. Combine that with high false-positive fatigue, and the risk isn’t just financial—it becomes structural.




A SCENARIO


Take the case of Meena, a home-based tailor in Coimbatore who applied for a ₹2.5 lakh loan to buy an embroidery machine. The branch cleared her eligibility in a day, but Hunter flagged a false match — same surname, similar pincode — linking her to an unrelated default 200 km away. By the time the branch verified and cleared the alert, her order window had closed.


That ₹2.5 lakh was never disbursed, but more critically, a small business opportunity slipped away because of systemic noise, not actual risk.


Why false positives happen (esp. in Retail & MSME)


Data realities in India


  • Shared identifiers: Families and micro-enterprises share phones, addresses, devices, even bank accounts.

  • Transliteration & spelling drift: “Shivkumar/Shiv Kumar/Shekhar Kumar” → fuzzy matches light up.

  • Recycled numbers & devices: Prepaid SIM churn; internet café/agent devices reused.

  • Thin-file applicants: Sparse bureau/KYC trails make models lean harder on weak proxies.

  • Unstandardized addresses: PIN-only, market names, or landmarks trigger “near match” noise.

  • Lag & noise in industry feeds: Delayed CFR updates or unclosed cases from other FIs show as “suspected fraud” for months.


Commercial realities

  • TAT pressure: Every extra manual review adds hours; MSME/retail funnels are TAT-sensitive.

  • Blunt thresholds: A single medium-confidence alert blocks everything, trading precision for a high false-positive cost.

  • Incentives: Branch P&L optimizes disbursal, not model quality; teams default to bypass.


Why this matters: False positives (FPs) erode conversion, customer trust, and risk team credibility. Over time, line units see fraud tools as blockers, not shields—so usage decays.


Why “only CFR & Hunter”? Where are the others in fraud prevention?


  • CFR (RBI Central Fraud Registry) has regulatory backing and safe-harbor to share adverse data; network effects make it the systemic base layer.

  • Hunter (industry consortium platform) has cross-lender application intelligence (multi-app, identity/device patterns) and mature workflow APIs.


Other sources exist but are not fraud registries:


  • CKYCR, PAN/NSDL, Aadhaar offline KYC → identity validation, not fraud history.

  • Credit bureaus (CIBIL/CRIF/Experian/Equifax) → repayment history & scorecards; some have application-level signals, but sharing adverse fraud intel is constrained by regulation/liability.

  • CRILC/MCA21/GSTN/AA → credit/financial data, useful for anomaly checks, not central fraud flags.


Building “another CFR/Hunter” needs legal cover, governance, liability frameworks, and massive participation—barriers that keep the field concentrated.


What’s improving in existing systems (and what still isn’t)


Real improvements underway (or feasible now):


  1. Match-score transparency: Vendors are exposing confidence scores + reason codes (e.g., Mobile + PAN 92% vs Address only 58%); an illustrative payload is sketched after this list.


  2. Faster dispositions loop: APIs to write back outcomes (confirmed fraud / false positive / customer clarified). This retrains models and reduces repeat FPs.


  3. Entity resolution 2.0: Better dedupe using phonetic + transliteration models (Soundex / Metaphone + Devanagari/Indic NLP), address standardization (PIN + geocoding).


  4. Device & graph signals: Privacy-preserving device/browser fingerprints and network graphs to catch mule rings—while de-emphasizing weak single-field matches.


  5. Segmented thresholds: Different alert thresholds for gold loans vs. PL vs. BNPL vs. micro-working capital. One-size-fits-all is out.


  6. SLA dashboards: Lenders can now demand precision/recall and turnaround SLAs per product/branch.
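
To make item 1 concrete, here is a minimal, hypothetical illustration of match-score transparency. The payload fields and reason-code names are invented for this sketch, not any vendor’s actual API:

```python
# Hypothetical alert payloads illustrating match-score transparency.
# Field names and reason codes are invented for this sketch, not a vendor API.
strong_alert = {
    "application_id": "APP-0001",
    "match_confidence": 92,               # 0-100 vendor confidence score
    "reason_code": "MOBILE_PAN_EXACT",    # strong multi-field hit
    "reason_text": "Mobile + PAN matched an earlier flagged application",
}

weak_alert = {
    "application_id": "APP-0002",
    "match_confidence": 58,               # weak single-field hit
    "reason_code": "ADDRESS_ONLY_FUZZY",
    "reason_text": "Address-only near match (PIN + landmark)",
}

# Downstream triage can branch on reason strength, not just the raw number.
for alert in (strong_alert, weak_alert):
    print(alert["application_id"], alert["match_confidence"], alert["reason_code"])
```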


Gaps that still bite:


  • Latency: Cross-lender events aren’t always near-real-time.

  • Explainability: Some alerts still read like black boxes; branches won’t act without “why.”

  • Governance drift: No industry-standard aging rules to “sunset” weak, unresolved flags.

  • Consent & privacy anxiety: Post-DPDP, many lenders want clearer templates on purpose limitation and retention—vendors must standardize.


What good looks like (operating model you can deploy)


1) Two-gate triage (machine + human), working on a 0–100 score; a minimal code sketch follows this list

  • RAG Tagging – Instant rules (sub-second):

    • Autoclear if score ≥ 95 and reason code is strong (e.g., PAN + DOB exact).

    • Autoblock if score ≤ 40 with hard fraud hit (confirmed fraud ID).

    • Review for the middle band with structured “what to ask” prompts.
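
Here is a minimal sketch of the instant-rules gate above, assuming the 0–100 score reads “higher = cleaner” (as in the threshold examples later in this piece) and using made-up reason-code labels:

```python
# Minimal sketch of the instant-rules gate described above.
# Assumptions: a 0-100 score where higher means cleaner (as in the threshold
# tables later in this piece), and simple string reason codes.
STRONG_REASONS = {"PAN_DOB_EXACT"}            # e.g., PAN + DOB exact
HARD_FRAUD_REASONS = {"CONFIRMED_FRAUD_ID"}   # confirmed fraud identity hit

def triage(score: int, reason_code: str) -> str:
    """Return AUTOCLEAR, AUTOBLOCK, or REVIEW per the instant rules."""
    if score >= 95 and reason_code in STRONG_REASONS:
        return "AUTOCLEAR"
    if score <= 40 and reason_code in HARD_FRAUD_REASONS:
        return "AUTOBLOCK"
    return "REVIEW"   # middle band goes to a human with structured prompts

print(triage(97, "PAN_DOB_EXACT"))        # AUTOCLEAR
print(triage(35, "CONFIRMED_FRAUD_ID"))   # AUTOBLOCK
print(triage(72, "ADDRESS_ONLY_FUZZY"))   # REVIEW
```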


2) Product-specific thresholds

  • Disbursal-critical (PL/BNPL): higher precision, accept higher FN risk.

  • Collateralized MSME: allow more reviews; precision–recall balanced.


  • Higher Precision


    The fraud detection system is tuned to reduce false positives (FP) — meaning, it tries not to wrongly block genuine customers.

    • Why? Because in PL or BNPL, customer experience is critical. A false decline could mean losing the customer permanently.

    • Example: If a genuine BNPL customer gets blocked at checkout, they’re less likely to return and may even damage the platform’s reputation.


  • Accepting Higher False Negatives (FN) Risk


    Accepting a slightly higher tolerance for fraud slipping through (false negatives) is a trade-off.

    • Why? The cost of friction or delay (from over-blocking) can be greater than the occasional fraud loss, especially when average ticket sizes are relatively small or portfolio margins are high.

    • Example: In a ₹20,000 personal loan or a ₹5,000 BNPL purchase, absorbing an occasional fraud loss may be cheaper than losing dozens of genuine customers due to aggressive fraud filters.


    Practical Example

    • Customer applies for a ₹50,000 personal loan.

      • Score: 92 (above threshold of 90 for instant approval).

      • Action: Loan is auto-approved and disbursed instantly.

    • Risk Trade-off: There’s a small chance the customer might be fraudulent, but the business accepts that risk to keep TAT (turnaround time) below 30 minutes and boost approval rates.


Fraud Control Trade-off Matrix

Dimension | Personal Loan (PL) | BNPL (Buy Now Pay Later)
--- | --- | ---
Transaction Size | High (₹50K – ₹5L) | Low to Medium (₹500 – ₹20K)
Business Priority | Speed with caution | Instant approval & smooth CX
Precision (Low False Positives) | Medium to High – don’t block good customers, but keep tighter checks due to higher ticket size | Very High – seamless experience is key
Recall (Low False Negatives) | Very High – fraud slip-throughs can be very costly | Medium – small fraud slips are less damaging
Risk Appetite | Lower – high exposure per account means each fraud hurts more | Higher – small-ticket fraud is acceptable for scale
Common Control Strategy | Multi-layer scoring + manual review for mid-band scores | Instant auto-approval for high scores; automated review for borderline scores
Example Thresholds | Score ≥ 95 → auto-approve; 70–94 → manual; ≤ 69 → block | Score ≥ 90 → auto-approve; 60–89 → automated secondary checks; ≤ 59 → block
Monitoring Focus | Portfolio-level stress testing, historical fraud pattern tracking | Real-time fraud patterns, device fingerprinting, behavioural anomalies

Explanation


  • Personal Loans (PL): Since loans are high-value, lenders emphasize strong recall, detecting as much fraud as possible even if it means a slightly higher rate of false positives. The process accepts some manual intervention for mid-range risk scores to avoid major losses.


  • BNPL: These are low-ticket, high-volume transactions. Any delay or false rejection hurts customer experience and conversion. Hence, fraud filters are tuned for high precision, accepting that some fraud may slip through. Losses are absorbed in the margins while models keep improving over time.
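
One simple way to encode these segmented profiles is a small per-product configuration; the band edges below come from the example thresholds in the matrix, while the structure itself is only an illustrative sketch:

```python
# Segmented thresholds lifted from the example rows in the matrix above.
# The structure is illustrative; a real deployment would version and govern it.
THRESHOLDS = {
    "PL":   {"auto_approve": 95, "review_floor": 70},   # 70-94 -> manual review
    "BNPL": {"auto_approve": 90, "review_floor": 60},   # 60-89 -> automated checks
}

def decide(product: str, score: int) -> str:
    """Map a 0-100 score to an action using the product's threshold profile."""
    t = THRESHOLDS[product]
    if score >= t["auto_approve"]:
        return "AUTO_APPROVE"
    if score >= t["review_floor"]:
        return "MANUAL_REVIEW" if product == "PL" else "AUTOMATED_SECONDARY_CHECKS"
    return "BLOCK"

print(decide("PL", 92))    # MANUAL_REVIEW (PL keeps the stricter 95 cut-off)
print(decide("BNPL", 92))  # AUTO_APPROVE
```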


3) Assisted resolution script (for middle band)

  • Auto-generated 2–3 clarifying questions (vernacular):

    • “This mobile is seen on other recent loan apps. Is it a family/shared number?”

    • “Please upload Udyam & GSTN OTP—quick verify.”

    • “Geo-tag this shop photo; our system mismatched address.”
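
A rough sketch of how the middle-band prompts could be generated: map the alert’s reason codes to a short question set. The codes and the plumbing are hypothetical; the question wording is from the examples above:

```python
# Illustrative mapping from alert reason codes to clarifying questions for the
# review band; the codes are invented here, and real prompts would be served
# as vernacular templates per branch.
CLARIFY_PROMPTS = {
    "SHARED_MOBILE": "This mobile is seen on other recent loan apps. Is it a family/shared number?",
    "BUSINESS_UNVERIFIED": "Please upload Udyam & GSTN OTP for a quick verification.",
    "ADDRESS_MISMATCH": "Geo-tag a photo of the shop; our system mismatched the address.",
}

def prompts_for(reason_codes):
    """Collect up to three clarifying questions for an alert's reason codes."""
    questions = [CLARIFY_PROMPTS[c] for c in reason_codes if c in CLARIFY_PROMPTS]
    return questions[:3]

print(prompts_for(["SHARED_MOBILE", "ADDRESS_MISMATCH"]))
```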


4) Closed-loop learning

  • Push final disposition back to CFR/Hunter/vendor within 48h; tag FP root cause (transliteration / recycled SIM / agent device).

  • Quarterly champion–challenger: test alternative matchers/thresholds on shadow traffic.
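
A sketch of what a write-back record might carry; neither CFR nor Hunter publishes this exact schema, so treat the fields below as purely illustrative:

```python
from datetime import datetime, timezone

# Hypothetical disposition record for the closed-loop write-back. Neither CFR
# nor Hunter publishes this exact schema; it only illustrates the content the
# loop needs: final outcome, FP root cause, and the 48-hour SLA.
disposition = {
    "alert_id": "ALRT-0042",
    "outcome": "FALSE_POSITIVE",        # or CONFIRMED_FRAUD / CUSTOMER_CLARIFIED
    "fp_root_cause": "RECYCLED_SIM",    # transliteration / recycled SIM / agent device
    "resolved_at": datetime.now(timezone.utc).isoformat(),
    "sla_hours": 48,
}
print(disposition)
```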


5) Metrics that matter

  • Alert precision (confirmed frauds / total alerts) by product & branch.

  • Average review TAT for middle band (target < 2 hours).

  • Auto clear rate (to protect TAT).

  • False-positive rate & cost per FP (lost fee income + ops time).
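
A quick, made-up worked example of the first three metrics (the FPR and cost-per-FP arithmetic is covered in the next section):

```python
# Back-of-envelope numbers for one branch-month, to show how the first three
# metrics are derived; the FPR and cost-per-FP math is worked through below.
total_alerts     = 500
confirmed_frauds = 40                      # alerts that turned out to be real fraud
auto_cleared     = 320                     # closed by the instant rules, no human touch
review_hours     = [1.5, 0.8, 3.0, 2.2]    # sample review times for middle-band cases

alert_precision = confirmed_frauds / total_alerts        # 0.08 -> 8%
auto_clear_rate = auto_cleared / total_alerts            # 0.64 -> 64%
avg_review_tat  = sum(review_hours) / len(review_hours)  # hours; target < 2

print(f"Precision {alert_precision:.0%}, auto-clear {auto_clear_rate:.0%}, "
      f"avg review TAT {avg_review_tat:.1f}h")
```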


Here’s a clear explanation of False-Positive Rate (FPR) and Cost per False Positive (FP) in the context of fraud prevention systems:


False-Positive Rate (FPR)


The False-Positive Rate measures how often your fraud detection system flags a legitimate transaction or customer as fraud.


FPR = (legitimate applications wrongly flagged as fraud ÷ total legitimate applications) × 100

Example:

  • Total legitimate loan applications processed: 10,000

  • Applications wrongly flagged as fraud: 200


FPR = 200 ÷ 10,000 = 2%

Interpretation: A high FPR means the system is too aggressive and is blocking genuine transactions, leading to:

  • Loss of good customers

  • Damage to customer experience

  • Operational delays from unnecessary manual reviews.


Cost per False Positive (FP)

Every false positive carries a real financial and operational cost. It’s a combination of lost revenue and extra operational effort.


Key components:

  • Lost Fee Income: If a genuine ₹2,00,000 personal loan is wrongly blocked, the bank loses origination fees, interest income, and potential cross-sell opportunities.

  • Operational Time: Manual reviews require analyst effort, adding to overhead. Example: If each review takes 15 minutes and costs ₹200 per case, 200 false positives = ₹40,000 in operational cost.


Combined Illustration

Metric | Value
--- | ---
Genuine transactions processed | 10,000
False positives | 200 (2% FPR)
Avg. loan size | ₹2,00,000
Avg. revenue per loan (fees + interest) | ₹10,000
Review cost per FP | ₹200

Impact Calculation:

  • Lost revenue: ₹10,000 × 200 = ₹20,00,000

  • Ops cost: ₹200 × 200 = ₹40,000

  • Total cost of FPs: ₹20,40,000
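
The same arithmetic, condensed into a few lines using the numbers from the table above:

```python
# Reproduces the illustration above: a 2% FPR and a total FP cost of ~Rs 20.4 lakh.
genuine_apps       = 10_000
false_positives    = 200
revenue_per_loan   = 10_000    # fees + interest, per the table
review_cost_per_fp = 200

fpr          = false_positives / genuine_apps          # 0.02 -> 2%
lost_revenue = false_positives * revenue_per_loan      # Rs 20,00,000
ops_cost     = false_positives * review_cost_per_fp    # Rs 40,000
total_cost   = lost_revenue + ops_cost                 # Rs 20,40,000

print(f"FPR {fpr:.1%} | lost revenue Rs {lost_revenue:,} | "
      f"ops cost Rs {ops_cost:,} | total Rs {total_cost:,}")
```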


Why This Matters

High false positives lead to lost business, higher costs, and strained customer relationships, which is why banks and fintechs constantly fine-tune their fraud models for a better balance between precision (not blocking good customers) and recall (catching fraudsters).



Practical fixes for Retail & MSME false positives

  • Collect richer, low-friction identifiers: Udyam + shop geo-pin + utility bill photo beats one flaky phone number.

  • Co-browse consented checks: When AA/GST fetch or KYC fails alone, combine—and do it with the customer to reduce drop-offs.

  • Agent hygiene: Flag device IDs of sourcing agents; rotate or ban devices linked to high FP clusters.

  • PIN + landmark template: Force address capture as PIN + local market name + photo, then normalize.

  • Language-aware names: Use phonetic search across English/Indic scripts to cut mismatches (a toy sketch follows this list).
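
As a toy illustration of language-aware matching on romanized names, the sketch below pairs a basic Soundex with a string-similarity check; production systems would add transliteration and Indic-script handling, but the idea is that “Shivkumar” and “Shiv Kumar” should collapse to one candidate while “Shekhar Kumar” should not:

```python
from difflib import SequenceMatcher

def soundex(name: str) -> str:
    """Very basic Soundex: keep the first letter, encode the rest, pad to 4 chars."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    letters = "".join(ch for ch in name.lower() if ch.isalpha())
    if not letters:
        return ""
    encoded, prev = letters[0].upper(), codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code
    return (encoded + "000")[:4]

def likely_same_person(a: str, b: str) -> bool:
    """Flag a likely match only when phonetics AND overall spelling agree."""
    phonetic_hit = soundex(a) == soundex(b)
    spelling_sim = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return phonetic_hit and spelling_sim > 0.8

print(likely_same_person("Shivkumar", "Shiv Kumar"))      # True  -> same applicant
print(likely_same_person("Shivkumar", "Shekhar Kumar"))   # False -> do not merge
```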


Where the ecosystem should go (R&D wish-list)


  1. India Fraud Graph (consortium): Privacy-safe, hashed identity graph (PAN/Aadhaar-token/CKYCR ID) that shares confirmed modus operandi and ring connections—not raw PII.

  2. Standard reason/aging codes: Industry-wide taxonomy (e.g., R01 = Translit mismatch, R07 = Recycled SIM) and 90/180-day sunset rules for unsubstantiated hits.

  3. Federated learning: Cross-bank model improvements without moving raw data—keeps DPDP happy, improves recall.

  4. Explainable alerts: Mandatory evidence snippets (“device X used on 5 lenders in 24h”) so branches trust the signal.

  5. Real-time claim-to-cash rails: For guarantee/fraud recoveries—shorter cycles reduce the perceived downside of pausing to investigate.


    We will explore each of these models in upcoming episodes to understand their purpose and efficacy.



Disclaimer:


The views and opinions expressed in this article are personal and meant for informational and educational purposes only. They do not represent the official stance of any financial institution, regulatory authority, or organization I am associated with. All data and examples, including references to RBI’s Central Fraud Registry (CFR) and Hunter, are used for illustrative purposes. Readers are advised to consult with qualified professionals or their internal compliance teams before making any operational, financial, or policy decisions based on the content shared here.



© 2025 Vivek Krishnan. All rights reserved.  
Unauthorized use or duplication of this content without express written permission is strictly prohibited.  
Excerpts and links may be used, provided that clear credit is given to Vivek Krishnan with appropriate and specific direction to the original content.
