
Ep 2 : The Concept of Federated Learning

Fraud Prevention Series


Three banks in Chennai are fighting a new card-testing fraud. Each sees a tiny piece of the pattern. If they could pool data, the fraud would pop instantly—but privacy rules and contracts say “no.”


Federated learning is like forming a joint “brain” that learns from all banks without anyone sharing raw customer data.


Think: bring the algorithm to the data, not the data to the algorithm. This lets you spot patterns earlier, stay compliant, and protect trust.

Payoff: better fraud catch-rates with zero data pooling.

When you type on your smartphone keyboard:


  • Your phone learns how you type — spelling corrections, shortcuts, and slang.

  • Google doesn’t see your messages. Instead, your phone sends only tiny updates about your typing patterns.

  • These updates are combined from millions of phones to make everyone’s keyboard smarter.

➡️ You benefit from collective intelligence — while your private messages stay private.


This is the base concept of Federated Learning. Federated Learning is like “sharing the wisdom, not the diary.”


Everyone contributes to making the model smarter, but no raw data leaves their place — perfect for privacy-sensitive environments like banking, healthcare, or even retail.

Federated Learning

Section 1 : Recognition and Acceptance


  • Mainstream in AI research: Federated Learning was introduced by Google researchers in 2016–17 and has since been extensively researched in academia and industry.


  • Large tech companies — Google, Apple, Meta, NVIDIA, Tencent, Ant Financial — have deployed FL in production for applications like predictive text, voice recognition, and recommendation systems.


Section 2 : Research and Adoption in BFSI


Federated Learning has been actively explored and piloted in banking and financial services, especially for fraud detection, risk scoring, and credit modelling.


a) Fraud Detection


  • Consilient (2025): Building FL-driven fraud detection models for correspondent banking networks to detect mule accounts and suspicious transaction patterns without sharing raw transaction data.


  • UK Finance (2025): Highlighted FL as an emerging technology to fight economic crime, enabling collaboration across institutions while maintaining privacy.


  • Mastercard and R3 pilots (2023–2024): Explored FL for real-time transaction fraud detection across multiple banks.


Think about CIBIL in the early 2000s. Banks were sceptical; no one wanted to share customer repayment data. Privacy, competition, and operational hurdles kept everyone in silos. But slowly, with the right structure and trust framework, they started contributing data. Today, that shared credit bureau is inseparable from every lending decision: retail, MSME, even cards.


Federated Learning is the next evolution of that collaboration. But here, the magic is that raw data never leaves the bank; only the ‘learnings’ or patterns do. The collaborative benefit stays the same, but the privacy risk is far lower, provided the updates themselves are protected.


CIBIL met resistance as a concept in 2000–2001, yet today no bank would process a loan without checking the bureau. In the same way, Federated Learning is poised to become that silent, collaborative layer in credit, fraud, and risk decisioning.


Section 3 : AML and Compliance

  • Multi-bank collaborations are exploring FL for:

    • Suspicious transaction monitoring

    • Sanctions screening pattern enhancement

    • Behavioural anomaly detection in trade finance and cross-border payments


Why BFSI is Interested

  • Regulatory Pressure: Privacy-first models help align with GDPR (Europe), DPDP Act (India), and other data residency requirements.

  • Data Sensitivity: BFSI deals with highly sensitive PII and transactional data that cannot be easily shared.

  • Network Effects: Cross-institution collaboration is key to combating fraud and systemic risks — FL makes this technically and legally feasible.


Section 4 : Proof of Maturity


  • Research Volume: Hundreds of papers published from 2020 to 2025 specifically analyse federated learning in fraud detection, credit scoring, and anti-money laundering.


  • Operational Pilots: From early pilots at HSBC, JPMorgan, and Deutsche Bank to production-grade models at Asian digital banks like WeBank and MYbank.


  • Tech Conferences: Federated Learning is a regular topic at NeurIPS, ICML, and KDD — with BFSI-specific sessions.


Section 5 : Challenges (Still Being Researched)


  • Model Poisoning: Attackers could inject biased updates.

  • Heterogeneous Data: Variations in bank data quality and schema need harmonization.

  • Compute/Infra Costs: Requires secure infrastructure for training rounds.

  • Explainability: Federated models need explainability layers for regulatory acceptance in credit decisions.
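
To make the model-poisoning challenge concrete, here is a minimal sketch of one common defence, robust aggregation, using a coordinate-wise median in place of a plain average. The update values are invented for illustration, and a real consortium would likely combine this with client vetting and monitoring.

```python
from statistics import mean, median


def aggregate(updates, reducer):
    """Combine per-bank update vectors coordinate by coordinate."""
    return [reducer(column) for column in zip(*updates)]


# Hypothetical two-parameter updates from four honest banks...
honest = [[0.10, -0.20], [0.12, -0.18], [0.09, -0.22], [0.11, -0.19]]
# ...and one poisoned update with an exaggerated magnitude.
poisoned = [[5.00, 4.00]]

updates = honest + poisoned
mean_update = aggregate(updates, mean)      # dragged towards the attacker
median_update = aggregate(updates, median)  # stays close to the honest banks

print("mean:  ", mean_update)
print("median:", median_update)
```

The plain mean is pulled far away from the honest values by a single outlier, while the median stays inside the honest range. The trade-off is that median-style rules can discard useful signal when banks' data genuinely differ.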


Section 6 : India Context


  • With the DPDP Act (2023) and tighter data localization norms, Indian banks and NBFCs are starting conversations on consortium-driven FL pilots for:

    • Credit bureau model enhancements

    • Multi-bank fraud detection rings

    • SME underwriting using shared alternate data


  • While public deployments are still rare in India, the global maturity curve suggests adoption is likely to follow.


Where it’s making news (2024–2025 snapshots)

  • Banking/fraud: Industry bodies and vendors are pushing FL for earlier fraud pattern detection across institutions—without sharing raw customer data.


  • Healthcare: Multi-hospital studies and reviews show FL can train strong models while keeping patient data on-prem.


Why it will work well ...

  1. Each bank keeps its data locked.

  2. A common model is sent to each bank’s secure server.

  3. Banks train locally on their own data; they send back only scrambled model tweaks (not records).

  4. A central coordinator averages the tweaks to improve the global model.

  5. Add secure aggregation (so no one sees any single bank’s update) and differential privacy (noise that hides any one customer’s influence).
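
Steps 1 to 4 above can be sketched in a few lines of plain Python. This is a toy illustration, not a production design: the three "banks", the single `txn_velocity` feature, the labelling rule, and all hyperparameters are assumptions made up for the example, and the model is a simple logistic regression trained by federated averaging.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def local_train(weights, records, lr=0.5, epochs=20):
    """Step 3: a bank refines the shared weights on its own records only."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for features, label in records:
            pred = sigmoid(sum(wi * xi for wi, xi in zip(w, features)))
            for j, xj in enumerate(features):
                grad[j] += (pred - label) * xj
        w = [wi - lr * gj / len(records) for wi, gj in zip(w, grad)]
    # Only the weight change (the "tweak") is returned, never the records.
    return [wn - wo for wn, wo in zip(w, weights)], len(records)


def federated_average(global_weights, updates):
    """Step 4: the coordinator averages tweaks, weighted by record count."""
    total = sum(n for _, n in updates)
    new_weights = list(global_weights)
    for delta, n in updates:
        for j, dj in enumerate(delta):
            new_weights[j] += dj * n / total
    return new_weights


def make_bank_data(n, seed):
    """Hypothetical toy data: [bias, txn_velocity] -> fraud label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        velocity = rng.uniform(-1.0, 1.0)
        rows.append(([1.0, velocity], 1 if velocity > 0.2 else 0))
    return rows


def accuracy(weights, records):
    hits = 0
    for features, label in records:
        score = sigmoid(sum(w * x for w, x in zip(weights, features)))
        hits += int((score > 0.5) == bool(label))
    return hits / len(records)


banks = [make_bank_data(200, seed) for seed in (1, 2, 3)]  # step 1: data stays local
global_weights = [0.0, 0.0]                                  # step 2: common model

for _ in range(10):  # ten training rounds
    updates = [local_train(global_weights, data) for data in banks]
    global_weights = federated_average(global_weights, updates)

print("global weights:", global_weights)
```

The coordinator only ever touches `updates`, which hold weight deltas and record counts. Step 5 (secure aggregation and differential privacy) is deliberately left out here to keep the sketch short.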


Quick BFSI use-cases

  • Fraud rings & mule detection (consortium): Learn cross-bank signals (device, velocity, merchant clusters) without moving PII.


  • Correspondent banking AML: Share typologies across partner banks via updates, not data.


  • Credit risk early-warning (NBFC + bank ecosystems): Combine behavior signals without exposing raw ledgers.



Federated Learning: A Tiny Bit Deeper


1️⃣ Common FL Topologies

| Type | How It Works | Example in BFSI |
| --- | --- | --- |
| Horizontal FL | Parties have similar data features but different users. | Multiple banks with similar schema (KYC, transactions) train a joint fraud detection model. |
| Vertical FL | Parties have different features about the same user, requiring secure entity matching. | A bank and a telecom provider combine insights (banking + telco usage) for credit scoring. |
| Federated Transfer / Hybrid | Mix of horizontal and vertical, often with public pretraining before FL rounds. | A bank pretrains on open credit datasets, then fine-tunes with consortium data for SME lending. |
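
The difference between horizontal and vertical FL is easiest to see in how the data is split. The sketch below uses invented records and field names. For the vertical case it matches customers through keyed hashes, which is a simplified stand-in for the "secure entity matching" step; real deployments typically use a private set intersection protocol, and the shared key here is purely hypothetical.

```python
import hashlib
import hmac

# Horizontal FL: same columns, different customers (hypothetical records).
bank_a = {"C001": {"txn_count": 42, "avg_amount": 1800.0}}
bank_b = {"C777": {"txn_count": 7, "avg_amount": 560.0}}
same_schema = set(bank_a["C001"]) == set(bank_b["C777"])

# Vertical FL: overlapping customers, different columns, keyed by phone number.
bank = {"9876500001": {"avg_balance": 25000.0},
        "9876500002": {"avg_balance": 900.0}}
telco = {"9876500002": {"recharge_gap_days": 31},
         "9876500003": {"recharge_gap_days": 5}}


def blind(identifier, shared_key):
    """Keyed hash of an identifier so raw IDs are not exchanged."""
    return hmac.new(shared_key, identifier.encode(), hashlib.sha256).hexdigest()


def blinded_tokens(ids, shared_key):
    """Map token -> original ID; the mapping stays with the party that built it."""
    return {blind(i, shared_key): i for i in ids}


key = b"demo-key-agreed-out-of-band"           # assumption: pre-shared secret
bank_tokens = blinded_tokens(bank, key)          # computed inside the bank
telco_tokens = set(blinded_tokens(telco, key))   # only tokens leave the telco

# The bank learns which of its own customers also appear at the telco.
overlap = sorted(bank_tokens[t] for t in bank_tokens.keys() & telco_tokens)
print("same schema:", same_schema, "| matched customers:", overlap)
```

Only the matched customers can be used for vertical training, since each party holds just part of the feature set for them.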


2️⃣ Risks & Practical Mitigations


| Risk | What It Means (Plain Speak) | Mitigation Strategies |
| --- | --- | --- |
| Data Leakage | Model updates could accidentally reveal patterns or rare data points. | Secure aggregation (no one sees individual updates) + Differential Privacy (add calibrated noise). |
| Model Poisoning / Backdoors | A malicious participant sends corrupted updates to bias the global model. | Robust aggregation, client vetting, drift monitoring, and audit trails. |
| Operational Complexity | Building and managing FL infra in-house is resource-heavy. | Start with a managed FL platform or open-source stacks like FATE (GitHub). |
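
The two data-leakage mitigations can be illustrated with a small sketch. It assumes pairwise random masks that cancel when all banks' updates are summed, and Gaussian noise added to the clipped total. The seeds, the noise level, and the update values are hypothetical. Real secure aggregation works over finite fields with cryptographic key agreement, and a real differential privacy guarantee needs carefully calibrated noise, so treat this as a picture of the idea rather than a secure implementation.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def pairwise_mask(bank_index, n_banks, dim, round_seed):
    """Sum of the masks this bank shares with every other bank.

    For each pair (i, j) with i < j, bank i adds the mask and bank j
    subtracts it, so all masks cancel in the consortium-wide total.
    """
    total = [0.0] * dim
    for other in range(n_banks):
        if other == bank_index:
            continue
        lo, hi = sorted((bank_index, other))
        # Stand-in for a secret seed agreed privately between the two banks.
        rng = random.Random(f"{round_seed}:{lo}:{hi}")
        sign = 1.0 if bank_index == lo else -1.0
        for k in range(dim):
            total[k] += sign * rng.uniform(-100.0, 100.0)
    return total


def masked_update(update, bank_index, n_banks, round_seed):
    """What a bank actually sends: its clipped update plus its masks."""
    mask = pairwise_mask(bank_index, n_banks, len(update), round_seed)
    return [u + m for u, m in zip(update, mask)]


def noisy_mean(masked_updates, noise_std, rng):
    """Coordinator view: sum masked vectors, add Gaussian noise, average."""
    n = len(masked_updates)
    sums = [sum(col) for col in zip(*masked_updates)]
    return [(s + rng.gauss(0.0, noise_std)) / n for s in sums]


true_updates = [[0.30, -0.10], [0.10, 0.20], [-0.20, 0.50]]  # hypothetical
clipped = [clip(u, max_norm=1.0) for u in true_updates]
masked = [masked_update(u, i, len(clipped), round_seed=7)
          for i, u in enumerate(clipped)]

# noise_std=0.0 shows that the masks cancel; raise it to add privacy noise.
aggregated = noisy_mean(masked, noise_std=0.0, rng=random.Random(0))
print("what the coordinator sees from bank 0:", masked[0])
print("aggregated mean update:", aggregated)
```

Each individual masked vector looks like noise to the coordinator, yet the average matches the true average of the clipped updates. Clipping here bounds each bank's influence; bounding any single customer's influence, as customer-level differential privacy requires, would need clipping inside each bank's local training as well.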




1️⃣ MSME Lending & Monitoring


a) Alternate Credit Scoring


  • Problem: Many MSMEs are “thin-file” — limited bureau history, irregular income flows.


  • How Federated Learning helps:

    • Combine internal transaction data (current accounts, GST turnover, POS receipts) with peer-bank signals to train better risk models — without exposing raw customer ledgers.

  • Example:

    • Bank A and Bank B each have 50,000 MSME accounts. Training a joint model detects risk triggers like sudden drop in daily credits or spike in bounced cheques much earlier.
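
The two risk triggers named in the example can be computed inside each bank before any federated training happens. The sketch below is one possible way to turn an account's history into flags; the function name, window, and thresholds are hypothetical and uncalibrated.

```python
from statistics import mean


def risk_triggers(daily_credits, bounced_cheques, window=7,
                  drop_ratio=0.5, spike_factor=2.0):
    """Compute two illustrative early-warning flags for one MSME account.

    daily_credits:   daily credit totals, oldest first.
    bounced_cheques: daily bounced-cheque counts, oldest first.
    Both histories must be longer than `window` days.
    """
    recent_credits = mean(daily_credits[-window:])
    baseline_credits = mean(daily_credits[:-window])
    recent_bounces = sum(bounced_cheques[-window:])
    # Baseline bounce count scaled to the same window length.
    baseline_bounces = mean(bounced_cheques[:-window]) * window
    return {
        "credit_drop": recent_credits < drop_ratio * baseline_credits,
        "bounce_spike": recent_bounces > spike_factor * max(baseline_bounces, 1.0),
    }


# Hypothetical account: steady credits for three weeks, then a sharp fall.
stressed = risk_triggers([1000] * 21 + [300] * 7,
                         [0] * 21 + [1, 0, 1, 0, 1, 0, 1])
healthy = risk_triggers([1000] * 28, [0] * 28)
print("stressed:", stressed)
print("healthy: ", healthy)
```

Flags like these stay inside the bank as model inputs. Only the resulting model updates would be shared with the consortium.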


b) Early Warning Systems (EWS)

  • Problem: NPAs often spike because early signs are missed.

  • How Federated Learning helps:

    • Create a collaborative anomaly detection model across banks for MSME borrowers.

    • E.g., payment delays seen at one bank shape that bank's model update, so the global model learns the trend across sectors.


c) Sector-Specific Insights

  • Pool anonymized learnings across industries like steel traders, textile SMEs, or logistics operators, to build sector-adjusted PD/LGD models faster than isolated banks.


2️⃣ Large Corporate Lending


a) Syndicated Loans & Consortium Lending


  • Problem: Each bank in a consortium sees partial data about a corporate borrower.

  • How FL helps:

    • Train shared risk models that evaluate cash-flow stress, intercompany linkages, and fund flow anomalies, while complying with confidentiality clauses.

  • Benefit: Stronger collective view for credit reviews, renewals, and early restructuring signals.


b) Fraud Pattern Detection


  • Problem: Complex round-tripping or layered transactions often stay hidden because data is fragmented across banks.

  • How FL helps:

    • Aggregate behavioural patterns (sudden fund flows between related accounts, irregular offshore routing) to flag high-risk patterns without sharing sensitive ledger entries.


c) Trade Finance & Cross-Border AML


  • Use case: Training across transaction, SWIFT, and customs trade data held at multiple banks to strengthen AML detection, which is critical as global compliance norms tighten.


3️⃣ Why FL Fits MSME & Corporate Use Cases


  • Data privacy: Keeps confidential borrower data within the originating bank.

  • Collaborative advantage: Leverages the network effect without regulatory breaches.

  • Regulatory compliance: Aligns with India’s DPDP Act and RBI’s data localization guidelines.

  • Customizable models: Can be tuned for risk scoring, EWS, fraud detection, or cash-flow analysis based on segment needs.


Section 7 : Bottom Line


  • Federated Learning is not an experimental fad anymore — it is a validated, peer-reviewed, and piloted technology across sectors.


  • In BFSI, it’s one of the most promising approaches for collaborative analytics where data privacy and security are paramount.


While most discussions on federated learning focus on cards and retail fraud, the real leap in India lies in MSME and corporate credit. Imagine a world where banks co-learn early warning signals for a ₹50 crore borrower across geographies — without exchanging a single line of raw data. That’s the quiet revolution FL promises.


Who’s doing what (India-focused)

  • NPCI + Banks — live pilot (2025):


    NPCI is piloting a federated AI model with four banks to improve fraud detection/risk scoring on UPI—banks share model insights (not raw data). Multiple mainstream outlets reported this in early April 2025. When asked which ones, NPCI’s CRO declined to name them. Other reports note it’s with “public and private sector banks,” again without listing names. This is sufficient to give you an idea of the revolution it is about to create.


Are we still going to hang on to CFR and Hunter databases?



Disclaimer:


The views and opinions expressed in this article are personal and meant for informational and educational purposes only. They do not represent the official stance of any financial institution, regulatory authority, or organization I am associated with. All data and examples, including references to RBI’s Central Fraud Registry (CFR) and Hunter, are used for illustrative purposes. Readers are advised to consult with qualified professionals or their internal compliance teams before making any operational, financial, or policy decisions based on the content shared here.



© 2025 Vivek Krishnan. All rights reserved.  
Unauthorized use or duplication of this content without express written permission is strictly prohibited.  
Excerpts and links may be used, provided that clear credit is given to Vivek Krishnan with appropriate and specific direction to the original content.
