09 Dec 2025
DeltaStream for Databricks: True Real-Time Streaming with Apache Flink, Now Inside the Databricks UI
For years, Databricks has been the center of gravity for data engineering and batch processing. Its unified Lakehouse approach revolutionized how organizations handle data at scale. However, when it comes to real-time data processing, many data engineers hit an architectural ceiling.
Databricks Structured Streaming, while powerful, is fundamentally built on a micro-batch architecture. For many use cases, processing data in small batches every few seconds is acceptable. But for true real-time scenarios such as fraud detection, live IoT monitoring, or dynamic pricing, sub-second latency isn’t just a "nice to have"; it’s a requirement.
Furthermore, working with streaming sources in Databricks often feels like operating in the dark. There is little to no visibility into what is actually happening in your Kafka topics, Kinesis streams, or CDC logs until you build a pipeline to land that data into a Delta table.
Today, that changes.
We are excited to announce DeltaStream for Databricks, a native in-UI integration that brings true, continuous Apache Flink streaming and real-time materialized views directly into the Databricks notebook experience. Databricks users can now explore, inspect, transform, and continuously process high-velocity Kafka/Kinesis streams and CDC data sources without ever leaving the Databricks UI.
This integration brings the power of Apache Flink, the gold standard for true stream processing, along with real-time materialized views powered by ClickHouse, right to where Databricks users already work.
This video shows how easy it is to build an end-to-end streaming pipeline that brings the power of Apache Flink into the Databricks UI:
Streaming & CDC Visibility, Right Inside Databricks
Previously, if a Databricks user wanted to know what was in a Kafka topic, they had to write a blind ingestion job, wait for a micro-batch to trigger, land the data, and then query it.
With the new DeltaStream integration, the "black box" of streaming sources is opened. Right from a Databricks Notebook, users can now use DeltaStream commands to instantly explore and inspect streaming data stores. You can see Kafka topics, Kinesis streams, and CDC tables before you write a single line of ingestion logic. This unprecedented visibility drastically accelerates development and debugging cycles.
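The inspect-before-ingest idea can be sketched in plain Python. The `peek` helper below is purely illustrative (in practice you would run DeltaStream's commands from the notebook; this is not DeltaStream's API): it samples the first few records of a stream so you can see the data's shape before writing any ingestion logic.

```python
import itertools
import json

def peek(records, n=5):
    """Return the first n records of a stream without consuming the rest.

    `records` can be any iterable -- in practice it would wrap a Kafka or
    Kinesis consumer; here it is abstract so the sketch stays self-contained.
    """
    return list(itertools.islice(records, n))

# Simulated Kafka topic: an iterator of JSON-encoded events.
topic = iter(
    json.dumps({"order_id": i, "amount": 10.0 * i}) for i in range(1_000)
)

sample = peek(topic, n=3)
for raw in sample:
    print(json.loads(raw))
```

The point is the workflow, not the helper: look at a handful of live records first, then write the pipeline.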
From Micro-Batch to True Continuous Streaming
Structured Streaming is fundamentally micro-batch. Even with low-latency configurations, workloads run as small batches, not as a continuously executing dataflow. This introduces:
- seconds (or more) of latency
- limited support for advanced event-time semantics
- operational complexity in managing state
- no ability to inspect or interact with raw streaming sources inside the UI
DeltaStream eliminates these limitations. It uses Apache Flink as its core compute engine, powering true stream processing with continuous operators and sub-second latency. Pipelines run as long-lived streaming jobs, not periodic micro-batches.
Databricks users can now take advantage of:
- Continuous event-time processing
- Stateful stream operations at scale
- Exactly-once semantics
- True low-latency outputs, not micro-batch approximations
All directly from their existing Databricks workflows.
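Event-time semantics are the key difference from wall-clock micro-batching. The toy operator below (pure Python, not Flink's API) sketches what a continuous engine does internally: assign each event to a tumbling window by its event timestamp, update per-window state on every event, and emit a window only once the watermark passes its end, which is how out-of-order events are handled correctly.

```python
from collections import defaultdict

class TumblingSum:
    """Toy continuous operator: per-window running sums, closed by watermark."""

    def __init__(self, window=10):
        self.window = window                 # window size in event-time seconds
        self.state = defaultdict(float)      # window_start -> running sum
        self.emitted = {}                    # finalized windows

    def on_event(self, event_time, value):
        start = (event_time // self.window) * self.window
        self.state[start] += value           # state updated continuously, per event

    def on_watermark(self, watermark):
        # Emit (and drop state for) every window that ended at/before the watermark.
        for start in sorted(self.state):
            if start + self.window <= watermark:
                self.emitted[start] = self.state.pop(start)

op = TumblingSum()
for t, v in [(1, 2.0), (4, 3.0), (12, 1.0), (7, 5.0)]:  # note out-of-order event at t=7
    op.on_event(t, v)
op.on_watermark(15)  # window [0, 10) is complete; [10, 20) stays open
print(op.emitted)    # -> {0: 10.0}
```

A micro-batch engine approximates this by re-running the aggregation on a schedule; a continuous engine maintains the state per event and releases results the moment the watermark allows.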
Continuously Updated Iceberg Tables (No More Staleness)
A major challenge with current Lakehouse streaming is data freshness. Even with aggressive micro-batching, data in your Lakehouse tables is always slightly stale.
The DeltaStream integration solves this by allowing you to sink the results of your Flink-powered pipelines directly into Lakehouse systems like Apache Iceberg. Because DeltaStream operates in true real-time, these Iceberg tables are continuously updated as new data is ingested.
Furthermore, DeltaStream automatically handles the maintenance and compaction of these tables.
The result? Databricks users can query these Iceberg tables with the confidence that the data is not stale by minutes or even seconds. It is truly up-to-date.
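To see why automatic compaction matters, consider that continuous commits produce many tiny data files, which are slow to scan. The greedy sketch below is only conceptual (it is not Iceberg's or DeltaStream's actual maintenance algorithm): it packs small files into roughly target-sized groups, the way a background compactor turns a stream of tiny commits into a few read-efficient files.

```python
def compact(file_sizes, target=128):
    """Greedy bin-fill compaction: pack small files into ~target-sized (MB) groups.

    Illustrative only -- shows why streaming sinks need background compaction,
    not how any particular table format implements it.
    """
    groups, current, current_size = [], [], 0
    for size in file_sizes:
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

# A minute of frequent commits leaves many tiny files...
small_files = [4] * 60            # sixty 4 MB files
compacted = compact(small_files)  # ...compacted into a couple of large ones
print(len(small_files), "->", len(compacted), "files")
```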
Real-Time Analytics with ClickHouse-Powered Materialized Views
For use cases requiring extremely fast queries, DeltaStream offers real-time materialized views built on ClickHouse:
- Sub-second analytical queries
- Perfect for monitoring, anomaly detection, dashboards, and alerting
- Fully maintained in real time by Flink pipelines
This gives Databricks users a way to combine long-term lakehouse storage (Iceberg) with ultra-fast operational analytics.
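Conceptually, a streaming materialized view is an aggregate updated once per event rather than recomputed per query. The plain-Python sketch below stands in for the Flink-maintained ClickHouse view (the class and its methods are invented for illustration): writes fold each event into per-key state, so a "query" is just a lookup.

```python
from collections import defaultdict

class RunningAvgView:
    """Toy materialized view: per-key running average, maintained per event."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def apply(self, key, value):
        # Called once per incoming event by the (hypothetical) stream pipeline.
        self.count[key] += 1
        self.total[key] += value

    def query(self, key):
        # Reads are O(1): the aggregate is already maintained.
        return self.total[key] / self.count[key]

view = RunningAvgView()
events = [("sensor-a", 20.0), ("sensor-b", 5.0), ("sensor-a", 22.0)]
for key, value in events:
    view.apply(key, value)
print(view.query("sensor-a"))  # -> 21.0
```

This is why such views suit dashboards and alerting: the expensive work happens on ingest, once per event, and query latency stays flat no matter how much history has streamed through.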
The Best of Both Worlds: Databricks + DeltaStream (Flink + ClickHouse)
With this new integration, Databricks users gain a powerful combination:
- Databricks for batch, ML, analytics, notebooks, governance
- DeltaStream for real-time ingestion, true continuous stream processing, and sub-second materialized views
This is the modern real-time data stack, now directly accessible inside the Databricks environment.
The era of compromising on latency to stay within the Lakehouse ecosystem is over. By integrating DeltaStream's Flink and ClickHouse-powered capabilities directly into Databricks, we are empowering data teams to build truly real-time applications without leaving their preferred environment. It’s time to move beyond micro-batches and embrace the speed of now.
Are you using Databricks? Connect with DeltaStream and bring your streaming data to life inside Databricks.
This blog was written by the author with assistance from AI to help with outlining, drafting, or editing.