Blueprint 1 · Real-Time ML Inference
ML inference at the data layer. No external model serving required.
Execute ML inference directly within streaming data pipelines—where transactions happen, before data moves.
<100ms detection latency, Kafka path
Four inference paths validated on PoC infrastructure
All data on-premises; no cloud ML service dependencies
ARCHITECTURE
How it works
How data moves
Architecture flow
| 01 | SOURCE | Transaction events write directly to EDB Postgres® AI (EDB PG AI) as the operational system of record. EDB Postgres Distributed is the source of truth—four downstream analytics and inference paths read from it via WAL, eliminating the need to move data before processing begins. |
| 02 | STREAM | Debezium captures WAL changes and streams them to Kafka. Kafka routes events to three parallel downstream paths: Kafka direct, ClickHouse (via CDC), and RisingWave (via CDC). A fourth path—PGAA—reads the WAL independently of Kafka. (Note: Redpanda is a validated Kafka-compatible alternative.) |
| 03 | AGGREGATE | RisingWave computes streaming materialized views: rolling spending patterns, geo-anomaly scores, and velocity checks. ClickHouse runs 90-day behavioral baseline comparisons for complex historical aggregations. |
| 04 | INFER | XGBoost ML models execute inference across all four paths with measured latencies from the PoC: Kafka direct <100ms TTDF; PGAA ~1500ms (WAL → Iceberg); ClickHouse ~3000ms; RisingWave ~4200ms. (Notes: KServe + NVIDIA NIM is supported in EDB PG AI but not in the current open source release. MLflow is supported but not integrated in the current release.) |
| 05 | GOVERN | Human-in-the-loop decision gates enforce policy on high-risk agent actions before execution. Lakekeeper (Vakamo) provides the Iceberg REST catalog—governing data lineage and access control across all four inference paths. |
| 06 | PERSIST | Fraud predictions and operational alerts write back to EDB PG AI. Historical inference results sink to Iceberg tables on MinIO for audit, model retraining, and longitudinal analytics. |
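The stream, aggregate, infer, and govern steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the blueprint's implementation: the event envelope follows the Debezium convention of a `payload` with `op` and `after` fields, while the column names (`txn_id`, `card_id`, `amount`, `event_ts_ms`), both thresholds, and the `score` rule are hypothetical. The `score` function is a hand-written stand-in for the XGBoost model, and the in-memory sliding window stands in for a RisingWave materialized view.

```python
import json
from collections import defaultdict, deque

# Hypothetical thresholds; the source does not specify any of these values.
VELOCITY_WINDOW_MS = 60_000   # look-back window for the velocity check
REVIEW_THRESHOLD = 0.75       # scores at or above this go to a human reviewer


def parse_change_event(raw: str):
    """Extract the new row image from a Debezium-style change event.

    Returns None for events without a usable "after" image (e.g. deletes).
    """
    payload = json.loads(raw)["payload"]
    if payload["op"] not in ("c", "u", "r"):  # create, update, snapshot read
        return None
    return payload["after"]


class VelocityCheck:
    """Counts transactions per card inside a sliding time window."""

    def __init__(self, window_ms: int = VELOCITY_WINDOW_MS):
        self.window_ms = window_ms
        self.seen = defaultdict(deque)  # card_id -> timestamps in the window

    def update(self, card_id: str, ts_ms: int) -> int:
        window = self.seen[card_id]
        window.append(ts_ms)
        # Evict timestamps that fell out of the look-back window.
        while window and window[0] <= ts_ms - self.window_ms:
            window.popleft()
        return len(window)


def score(amount: float, velocity: int) -> float:
    """Stand-in for the XGBoost model: a hand-written rule, not a trained model."""
    risk = 0.1 * velocity + (0.5 if amount > 1_000 else 0.0)
    return min(risk, 1.0)


def decide(risk: float) -> str:
    """Route high-risk predictions to a human-in-the-loop gate."""
    return "needs_review" if risk >= REVIEW_THRESHOLD else "approve"


def process(raw_events):
    """Run each change event through parse -> aggregate -> infer -> govern."""
    velocity = VelocityCheck()
    decisions = []
    for raw in raw_events:
        row = parse_change_event(raw)
        if row is None:
            continue
        count = velocity.update(row["card_id"], row["event_ts_ms"])
        decisions.append((row["txn_id"], decide(score(row["amount"], count))))
    return decisions
```

Feeding `process` three large transactions on the same card within a few seconds approves the first two and flags the third for review, because the velocity count pushes the stand-in score over the hypothetical threshold. In a deployment along the lines described above, the decision tuples would be written back to Postgres rather than returned in a list.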
Blueprint 1 · PARTNER STACK
Validated partners in this blueprint
Airflow (Astronomer)
dbt
Grafana
Jupyter
Kafka / Redpanda
KServe + NVIDIA NIM
Lakekeeper (Vakamo)
Langflow
MinIO
MLflow
RisingWave
INDUSTRY USE CASES
Blueprint 1 in production
BFSI
Real-time fraud detection
A tier-1 bank processes 50,000 transactions per second. Batch-based fraud detection carries a 15-minute detection window. Deploying XGBoost within the Kafka streaming path cuts detection latency from 15 minutes to 500ms. Three parallel inference paths provide immediate blocking decisions and behavioral context enrichment simultaneously. All data remains on-premises—no cloud ML service dependencies.
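To put the latency figures in this scenario in perspective, the short calculation below estimates how many transactions pass through before a fraud signal can act on them. It is a back-of-the-envelope sketch that assumes the 50,000 transactions per second rate is sustained for the whole detection window; the function name is illustrative only.

```python
TPS = 50_000              # transactions per second, from the scenario
BATCH_WINDOW_S = 15 * 60  # 15-minute batch detection window, in seconds
STREAM_WINDOW_S = 0.5     # 500 ms streaming detection latency, in seconds


def exposed_transactions(tps: int, window_s: float) -> int:
    """Transactions processed during one detection window at a constant rate."""
    return int(tps * window_s)


batch_exposure = exposed_transactions(TPS, BATCH_WINDOW_S)    # 45,000,000
stream_exposure = exposed_transactions(TPS, STREAM_WINDOW_S)  # 25,000
reduction = BATCH_WINDOW_S / STREAM_WINDOW_S                  # 1800.0
```

Under that assumption, the detection window shrinks by a factor of 1,800, and the number of transactions processed before detection drops from 45 million to 25,000.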
Validated deployment environments
Runs on-premises, on IBM Power, on EDB engineered systems, or on any cloud, with a consistent Postgres interface across all environments.
Try it now—Blueprint 1 is open source.
The full BFSI fraud detection implementation is on GitHub.
Deploy it, fork it, or build on it.