14 Jul 2025

Unmasking Illicit Finance: Building a Real-Time AML Inference Pipeline with LLMs and DeltaStream

Hojjat Jafarpour

Founder & CEO

The digital age has brought unprecedented speed and complexity to financial transactions, making anti-money laundering (AML) efforts more challenging than ever. Traditional rule-based AML systems, often reliant on batch processing, struggle to keep pace with sophisticated financial criminals who exploit that same speed and complexity.

Enter the power of real-time processing combined with the analytical prowess of Large Language Models (LLMs). Imagine a system that can detect suspicious patterns, anomalies, and hidden relationships in financial flows as they happen, enabling immediate intervention. This isn't a distant dream; it's achievable today by leveraging platforms like DeltaStream to construct a robust, real-time AML inference pipeline powered by LLMs.

Why LLMs for Anti-Money Laundering?

LLMs have revolutionized natural language understanding and generation, but their utility extends far beyond chatbots. In AML, LLMs offer a significant leap over conventional methods due to their ability to:

Uncover Complex Patterns: Unlike rigid rules, LLMs can learn intricate, non-obvious patterns within vast datasets of transaction descriptions, communications, and customer data. This includes identifying "structured" transactions (like smurfing, where large sums are broken into smaller amounts) that are designed to evade detection.
Understand Context: LLMs can process unstructured data (e.g., free-text fields, notes) to extract crucial context, infer intent, and link seemingly disparate pieces of information.
Detect Behavioral Anomalies: By learning "normal" customer behavior from historical data, LLMs can flag deviations, enabling proactive identification of risky activities as they emerge.
Automate and Enhance Reporting: LLMs can assist in generating more accurate and detailed Suspicious Activity Reports (SARs) by summarizing relevant information and even pre-populating forms.
Reduce False Positives: Although this remains difficult, well-tuned LLMs can potentially reduce the high false positive rates often associated with traditional AML systems, allowing human analysts to focus on truly suspicious cases.
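
To make the structuring pattern mentioned in the list concrete, the following Python sketch shows what smurfing looks like in data: several deposits that each stay under a reporting threshold but together exceed it within a short window. This is a fixed rule of the kind the article contrasts with LLMs, included only to illustrate the pattern. The function name, the 10,000 threshold, and the 24-hour window are illustrative assumptions, not values taken from any regulation or from DeltaStream.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative values only; real thresholds and windows depend on jurisdiction and policy.
REPORTING_THRESHOLD = 10_000.00
WINDOW = timedelta(hours=24)


def find_structuring_candidates(deposits):
    """Return customer IDs whose sub-threshold deposits sum past the threshold within WINDOW.

    `deposits` is an iterable of (customer_id, timestamp, amount) tuples.
    """
    by_customer = defaultdict(list)
    for customer_id, ts, amount in deposits:
        # Keep only deposits that individually stay under the reporting threshold.
        if amount < REPORTING_THRESHOLD:
            by_customer[customer_id].append((ts, amount))

    flagged = set()
    for customer_id, rows in by_customer.items():
        rows.sort()  # order by timestamp
        start = 0
        running_total = 0.0
        for ts, amount in rows:
            running_total += amount
            # Drop deposits from the left until the span is at most WINDOW long.
            while ts - rows[start][0] > WINDOW:
                running_total -= rows[start][1]
                start += 1
            if running_total >= REPORTING_THRESHOLD:
                flagged.add(customer_id)
                break
    return flagged


if __name__ == "__main__":
    t0 = datetime(2025, 7, 14, 9, 0)
    sample = [
        ("c1", t0, 4000.0),
        ("c1", t0 + timedelta(hours=2), 3500.0),
        ("c1", t0 + timedelta(hours=5), 3000.0),
        ("c2", t0, 4000.0),
        ("c2", t0 + timedelta(days=3), 7000.0),
    ]
    print(find_structuring_candidates(sample))
```

A rule like this only catches the exact shape it encodes; splitting deposits across accounts or stretching them just past the window defeats it, which is the gap the learned approach is meant to address.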

Why DeltaStream for Real-Time AML?

Building a real-time AML pipeline demands a platform capable of handling continuous data streams, performing complex transformations, and serving insights with minimal latency. DeltaStream, built on Apache Flink, is ideally suited for this role:

Real-Time Data Ingestion & Processing: DeltaStream excels at ingesting massive volumes of streaming financial transaction data from sources like Kafka or Kinesis. Its Flink-powered engine ensures low-latency processing.
SQL-Native Stream Processing: It allows financial institutions to define sophisticated real-time analytics and transformations using familiar SQL, simplifying the development of complex feature engineering logic.
Unified Streaming Catalog & Governance: DeltaStream provides a centralized catalog for all streaming data, alongside robust Role-Based Access Control (RBAC), ensuring data security and compliance in a highly regulated industry.
Materialized Views for Operational Efficiency: It can maintain real-time materialized views of aggregated data, crucial for powering dashboards, alerts, and downstream systems.
Cost Efficiency: By "shifting left" data processing from batch data warehouses, DeltaStream can reduce overall operational costs associated with large-scale analytics.

Architecting the Real-Time AML Inference Pipeline

A real-time AML pipeline combining LLMs and DeltaStream typically follows a Feature/Training/Inference (FTI) architecture:

1. Data Ingestion & Feature Engineering (Powered by DeltaStream)

Streaming Sources: Financial transactions, customer updates, sanctions lists, and other relevant data stream continuously into DeltaStream.
Real-time Transformations: DeltaStream processes these raw streams. This is where crucial feature engineering takes place. Using SQL, you can:
- Extract entities (names, addresses, transaction types, amounts).
- Calculate real-time aggregations (e.g., total transactions in the last 5 minutes for a customer).
- Join streaming data with static reference data (e.g., customer profiles, known risky entities).
- Generate textual features for LLM input (e.g., combining transaction description with counterparty name).
Feature Store Integration: The engineered features are often pushed to a real-time feature store (like Feast), ensuring consistency between the data used for training the LLM and the data used for live inference.
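
The feature engineering steps above can be sketched in plain Python to show the logic involved. In the pipeline described here this work would be expressed in DeltaStream SQL; the sketch below is not DeltaStream code, and every name in it (FeatureBuilder, the field names, the reference tables) is a hypothetical stand-in. It also assumes events arrive in timestamp order per customer.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)

# Hypothetical static reference data; a real pipeline would join against governed tables.
CUSTOMER_PROFILES = {"c-100": {"segment": "retail"}}
RISKY_COUNTERPARTIES = {"Acme Shell Holdings"}


class FeatureBuilder:
    """Keeps a per-customer sliding window and emits one feature row per transaction."""

    def __init__(self):
        # customer_id -> deque of (timestamp, amount) inside the current window
        self._recent = defaultdict(deque)

    def build(self, txn):
        cid, ts, amount = txn["customer_id"], txn["timestamp"], txn["amount"]
        window = self._recent[cid]
        window.append((ts, amount))
        # Evict transactions older than WINDOW relative to the newest event.
        while ts - window[0][0] > WINDOW:
            window.popleft()

        profile = CUSTOMER_PROFILES.get(cid, {})
        return {
            "customer_id": cid,
            # Real-time aggregations over the last 5 minutes.
            "txn_count_5m": len(window),
            "txn_total_5m": round(sum(a for _, a in window), 2),
            # Join with static reference data.
            "segment": profile.get("segment", "unknown"),
            "counterparty_is_risky": txn["counterparty"] in RISKY_COUNTERPARTIES,
            # Textual feature for LLM input.
            "llm_text": f"{txn['description']} | counterparty: {txn['counterparty']}",
        }


if __name__ == "__main__":
    fb = FeatureBuilder()
    t0 = datetime(2025, 7, 14, 12, 0)
    row = fb.build({
        "customer_id": "c-100",
        "timestamp": t0,
        "amount": 100.0,
        "counterparty": "Acme Shell Holdings",
        "description": "wire transfer",
    })
    print(row)
```

Each output row carries both numeric aggregates and a text field, so the same record can feed a feature store and serve as LLM input.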

2. LLM Training (Offline/Batch)

Historical Data: The LLM is trained on historical, labeled financial data (transactions, customer information, past SARs, known illicit activities). This training leverages the features prepared by the feature pipeline.
Model Selection & Fine-tuning: This involves choosing an appropriate LLM architecture and fine-tuning it for AML-specific tasks like anomaly detection, risk scoring, and suspicious activity classification.
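
As a rough illustration of preparing fine-tuning data, the sketch below turns labeled historical feature rows into prompt/completion records serialized as JSON Lines. The prompt template, the label set, and the feature field names are all assumptions made for this example; the actual record format depends on the model and training framework chosen.

```python
import json

LABELS = ("suspicious", "not_suspicious")  # hypothetical label set


def to_training_record(features, label):
    """Format one labeled feature row as a prompt/completion pair."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    prompt = (
        "Classify the following transaction for AML review.\n"
        f"Description: {features['llm_text']}\n"
        f"Transactions in last 5 minutes: {features['txn_count_5m']}\n"
        f"Total amount in last 5 minutes: {features['txn_total_5m']:.2f}\n"
        "Answer:"
    )
    return {"prompt": prompt, "completion": " " + label}


def to_jsonl(examples):
    """Serialize (features, label) pairs, one JSON object per line."""
    return "\n".join(json.dumps(to_training_record(f, l)) for f, l in examples)


if __name__ == "__main__":
    history = [
        ({"llm_text": "wire transfer | counterparty: Acme Shell Holdings",
          "txn_count_5m": 7, "txn_total_5m": 9800.0}, "suspicious"),
        ({"llm_text": "card payment | counterparty: Corner Cafe",
          "txn_count_5m": 1, "txn_total_5m": 4.5}, "not_suspicious"),
    ]
    print(to_jsonl(history))
```

Building training prompts from the same feature fields used at inference time is what keeps training and serving consistent.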

3. Real-Time LLM Inference (DeltaStream as the Orchestrator)

Triggering Inference: As new, real-time features are generated by DeltaStream, they are fed to the deployed LLM for inference.
LLM Prediction: The LLM analyzes these features and identifies potential anomalies, unusual patterns, or links to known illicit activities. It can provide a risk score, classify the type of suspicious activity, or even generate a preliminary narrative.
DeltaStream for Post-Inference Processing: The LLM's predictions are streamed back into DeltaStream. Here, further real-time logic can be applied:
- Thresholding & Rule Application: Apply business rules or thresholds to LLM scores to filter out low-risk alerts.
- Alert Generation: If a transaction meets the criteria, DeltaStream can trigger immediate alerts to AML analysts.
- Contextual Enrichment: Enrich the alert with additional real-time context from other streams (e.g., customer’s recent activity, news related to involved entities).
- Downstream System Integration: Push alerts and enriched data to case management systems, fraud investigation platforms, or even directly to transaction blocking systems.
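
The post-inference steps above (thresholding, alert generation, enrichment, and downstream routing) can be sketched as follows. This is a minimal Python illustration, not DeltaStream code: the prediction fields, the 0.8 threshold, and the `sink` callable standing in for a case management system are all hypothetical.

```python
from dataclasses import dataclass, field

ALERT_THRESHOLD = 0.8  # illustrative; tuned per institution in practice


@dataclass
class Alert:
    transaction_id: str
    customer_id: str
    risk_score: float
    activity_type: str
    context: dict = field(default_factory=dict)


def process_prediction(prediction, recent_activity, threshold=ALERT_THRESHOLD):
    """Return an enriched Alert, or None when the score is below the threshold."""
    if prediction["risk_score"] < threshold:
        return None  # low-risk prediction filtered out
    customer_id = prediction["customer_id"]
    return Alert(
        transaction_id=prediction["transaction_id"],
        customer_id=customer_id,
        risk_score=prediction["risk_score"],
        activity_type=prediction.get("activity_type", "unclassified"),
        # Contextual enrichment from another stream, here a plain dict lookup.
        context={"recent_activity": recent_activity.get(customer_id, [])},
    )


def route_alerts(predictions, recent_activity, sink):
    """Send every alert to `sink` and return how many were sent."""
    sent = 0
    for prediction in predictions:
        alert = process_prediction(prediction, recent_activity)
        if alert is not None:
            sink(alert)
            sent += 1
    return sent


if __name__ == "__main__":
    predictions = [
        {"transaction_id": "t1", "customer_id": "c1", "risk_score": 0.93,
         "activity_type": "structuring"},
        {"transaction_id": "t2", "customer_id": "c2", "risk_score": 0.12},
    ]
    recent = {"c1": ["3 cash deposits in 24h"]}
    route_alerts(predictions, recent, print)
```

Keeping the threshold outside the model means compliance teams can adjust alert volume without retraining.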

4. Feedback Loop & Continuous Improvement

Analyst Feedback: Insights from human analysts (e.g., confirmed money laundering cases, false positives) are fed back into the system.
Model Retraining: This feedback is crucial for retraining and fine-tuning the LLM, ensuring it adapts to new money laundering techniques and improves its accuracy over time.
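
One way to picture the feedback loop is as two small steps: measure how often analysts confirm alerts, then turn their verdicts into labeled examples for the next training run. The Python sketch below assumes a hypothetical feedback record with `alert_id` and `verdict` fields and a lookup of the features that produced each alert; none of these names come from DeltaStream or any specific case management tool.

```python
# Hypothetical mapping from analyst verdicts to training labels.
VERDICT_TO_LABEL = {"confirmed": "suspicious", "false_positive": "not_suspicious"}


def summarize_feedback(feedback):
    """Count analyst verdicts and compute the share of reviewed alerts that were confirmed."""
    confirmed = sum(1 for f in feedback if f["verdict"] == "confirmed")
    false_positives = sum(1 for f in feedback if f["verdict"] == "false_positive")
    reviewed = confirmed + false_positives
    return {
        "reviewed": reviewed,
        "confirmed": confirmed,
        "false_positives": false_positives,
        "alert_precision": confirmed / reviewed if reviewed else None,
    }


def build_retraining_set(feedback, features_by_alert):
    """Pair each resolved verdict with the features that triggered the alert."""
    examples = []
    for f in feedback:
        label = VERDICT_TO_LABEL.get(f["verdict"])
        features = features_by_alert.get(f["alert_id"])
        if label is None or features is None:
            continue  # skip unresolved verdicts and alerts with no stored features
        examples.append({"features": features, "label": label})
    return examples


if __name__ == "__main__":
    feedback = [
        {"alert_id": "a1", "verdict": "confirmed"},
        {"alert_id": "a2", "verdict": "false_positive"},
        {"alert_id": "a3", "verdict": "pending"},
    ]
    features = {"a1": {"txn_count_5m": 7}, "a2": {"txn_count_5m": 1}}
    print(summarize_feedback(feedback))
    print(build_retraining_set(feedback, features))
```

Tracking alert precision over time also gives an early signal of model drift between retraining runs.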

Benefits and Challenges

Benefits:

Speed and Agility: Detect and react to financial crime in real time, significantly reducing the window for illicit activities.
Enhanced Accuracy: LLMs can uncover subtle and complex patterns that traditional rules miss, leading to more precise detection.
Reduced False Positives: Although this is still an area of active development, LLMs have the potential to reduce the investigative burden by focusing analysts on higher-probability threats.
Scalability: Both LLMs and DeltaStream are designed for scalability, handling massive volumes of data and requests.
Adaptability: LLMs can adapt to evolving money laundering typologies with continuous retraining.

Challenges:

Explainability: Understanding why an LLM flagged a transaction can be difficult, which poses challenges for regulatory compliance and investigations. Explanation techniques such as LIME or SHAP can help here.

Data Quality and Bias: LLMs are highly dependent on the quality and representativeness of their training data. Biases in historical data can lead to unfair or inaccurate predictions.

Computational Cost: Running LLM inference in real-time can be computationally intensive, requiring optimized serving infrastructure.

Regulatory Scrutiny: The use of AI, especially "black box" models, in critical areas like AML is under increasing regulatory scrutiny, demanding robust governance and auditability.
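
To give a feel for the explainability challenge listed above, here is a minimal Python sketch of leave-one-out (occlusion) attribution: remove each word from the input text in turn and record how much the risk score drops. This is a simpler technique than LIME or SHAP, not an implementation of either. The `toy_score` function is a hypothetical stand-in for a deployed model, and its keywords and weights are invented for the example.

```python
def token_attributions(text, score_fn):
    """Rank tokens by how much the score falls when each one is removed."""
    tokens = text.split()
    baseline = score_fn(text)
    attributions = []
    for i, token in enumerate(tokens):
        # Re-score the text with token i left out.
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        attributions.append((token, baseline - score_fn(reduced)))
    # Largest score drop first; ties keep their original order.
    return sorted(attributions, key=lambda pair: pair[1], reverse=True)


def toy_score(text):
    """Stand-in for a deployed model: sums weights of hypothetical risk keywords."""
    keywords = {"offshore": 0.5, "cash": 0.3}
    words = text.lower().split()
    return min(1.0, sum(weight for word, weight in keywords.items() if word in words))


if __name__ == "__main__":
    ranked = token_attributions("urgent cash transfer to offshore account", toy_score)
    for token, delta in ranked:
        print(f"{token}: {delta:.2f}")
```

The approach needs one extra model call per token, so it also illustrates why explanation and computational cost pull against each other in a real-time setting.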

Conclusion

The combination of LLMs and DeltaStream represents a powerful paradigm shift in the fight against financial crime. By building real-time inference pipelines, financial institutions can move from reactive to proactive AML, leveraging the deep analytical capabilities of LLMs for sophisticated anomaly detection and the unparalleled speed and processing power of DeltaStream for continuous, actionable intelligence. As these technologies mature and best practices evolve, we can anticipate a significant strengthening of our defenses against money laundering in the real-time financial world.

This blog was written by the author with assistance from AI to help with outlining, drafting, or editing.
