The Road to Raising DeltaStream’s Series A

DeltaStream has secured $15M Series A funding

Today, I’m excited to announce DeltaStream has secured $15M in Series A funding from New Enterprise Associates (NEA), Galaxy Interactive, and Sanabil Investments. This brings our total raised to $25M and will accelerate DeltaStream’s vision of providing a serverless stream processing platform to manage, secure, and process all streaming data.

The Beginnings of Our Story

I joined Confluent in 2016 because I was given the opportunity to build a SQL layer on top of Kafka so users could build streaming applications in SQL. ksqlDB was the product we created, and it was one of the first SQL processing layers on top of Apache Kafka. While ksqlDB was a significant first step, it had limitations, including being too tightly coupled with Kafka, working with only one Kafka cluster, and generating a lot of network traffic on the Kafka cluster.

The need for a next-generation stream processing platform was obvious. It had to be a completely new platform from the ground up, so it made sense to start from scratch. Starting DeltaStream was the beginning of a journey to revolutionize the way organizations manage and process streaming data. The challenges we set out to solve were:

  1. Ease of use: writing SQL queries should be all a user worries about
  2. Provide a single data layer to process and analyze all streaming and batch data, including, for example, data from Kafka, Kinesis, Postgres, and Snowflake
  3. Standardize and authorize access to all data
  4. Enable high scale and resiliency
  5. Support flexible deployment models

Building DeltaStream

At DeltaStream, Apache Flink is our processing/computing engine. Apache Flink has emerged as the gold standard platform for stream processing with proven capabilities and a large and vibrant community. It’s a foundational piece of our platform, but there’s much more. Here is how we solved the challenges outlined above:

Ease of use

We have abstracted away the complexities of running Apache Flink and made it serverless. Users don’t have to think about infrastructure and can instead focus on writing queries. DeltaStream handles all the operations, including fault tolerance and elasticity.

Single Data Layer

DeltaStream can read across many modern streaming stores, databases, and data lakes. We then organize this data into a logical hierarchy, making it easy to analyze and process the underlying data. 

Standardize Access

We built a governance layer to manage access through fine-grained permissions across all data rather than across disparate data stores. For example, you manage access to data in your Kafka clusters and Kinesis streams all within DeltaStream.

Enable High Scale and Resiliency

Each DeltaStream query is run in isolation, eliminating the “noisy neighbor” problem. Queries can be scaled up/down independently.

Flexible Deployment Models

In addition to our cloud service, we provide BYOC for companies that want more control of their data. This is essential for highly regulated industries and companies with strict data security policies. 

Also, with DeltaStream, we wanted to go beyond Flink and provide a full suite of analytics by enabling users to build real-time materialized views with sub-second latency. 

What’s next for DeltaStream

We’re just getting started. Here are a few things we’re planning:

  • Increasing the number of stores we can read from and write to, including platforms such as Apache Iceberg and ClickHouse.
  • Increasing the number of clients/adaptors we support, including dbt and Python.
  • Multi-cloud support
  • Leveraging AI to enable users with no SQL knowledge to interact with DeltaStream

If you are a streaming and real-time data enthusiast and would like to help build the future of streaming data, please reach out to us; we are hiring for engineering and GTM roles!

If you want to experience how DeltaStream enables users to maximize the value of their streaming data, try it for yourself by heading to deltastream.io.

Finally, I would like to thank our customers, community, team, and partners—including our investors—for their unwavering support. Together, we are making stream processing a reality for organizations of all sizes.

17 Sep 2024

DeltaStream Raises $15M in Series A Funding to Deliver the Future of Real-Time Stream Processing in Cloud

DeltaStream, Inc., the serverless stream processing platform, today announced it has raised $15M in Series A funding from New Enterprise Associates (NEA), Galaxy Interactive and Sanabil Investments. The funding will accelerate the company’s vision of enabling a complete serverless stream processing platform to manage, secure and process all streaming data, irrespective of data source or pipeline.

Event streaming platforms such as Apache Kafka have become essential in today’s data-driven industries, with many sectors adopting real-time data streaming. As AI advances, real-time data is becoming even more critical for applications.

DeltaStream, founded by Hojjat Jafarpour, CEO and creator of ksqlDB, helps organizations build real-time streaming applications and pipelines with SQL in minutes. The platform leverages the power of Apache Flink® while simplifying its complexity, making it accessible to businesses of all sizes, from startups to large enterprises. DeltaStream is available as a fully managed service or a bring-your-own-cloud (BYOC) option.

“Streaming data is essential for modern apps, but it’s been difficult and costly to get value from it. DeltaStream’s serverless platform simplifies infrastructure management, allowing users to focus on building their applications,” said Hojjat Jafarpour. “To fully benefit from real-time data, customers need an organized view across all data stores, role-based access control, and secure sharing of real-time data. What Databricks and Snowflake did for stored data, DeltaStream does for streaming data.”

DeltaStream also integrates with Databricks and Snowflake, letting customers create real-time pipelines that quickly move data from streaming platforms like Apache Kafka to these systems.

“We’re seeing rapid adoption of streaming data and the rise of platforms such as Apache Flink®,” NEA partner and DeltaStream board member Aaron Jacobson explained. “However, one of the main challenges of using such systems has been their operational complexity. With DeltaStream, users have the power of Apache Flink® without having to deal with its complexity, resulting in significantly lower costs and accelerated time to market for real-time data applications.”

“DeltaStream is leading the charge in real-time streaming, where speed, low latency and intelligent decision-making are critical for businesses to maintain a competitive edge,” said Jeff Brown, Partner at Galaxy Interactive. “Their enterprise-grade, secure, and scalable solution simplifies complex stream processing, allowing teams to focus on deriving insights rather than managing infrastructure.”

About DeltaStream:

DeltaStream’s innovative stream processing platform harnesses the power of Apache Flink® to simplify the processing of real-time data. The platform also provides governance, organization, and secure sharing capabilities for streaming data across all streaming storage platforms, including Apache Kafka, Apache Pulsar, AWS Kinesis, and many more. In addition to its SaaS offering, DeltaStream’s platform is available as a private SaaS (also known as Bring Your Own Cloud) deployment to address the needs of regulated industries with high data privacy and security requirements. DeltaStream’s platform seamlessly integrates with both Databricks and Snowflake, enabling customers to build real-time data pipelines that make data available in those platforms seconds after it lands in streaming platforms such as Apache Kafka.
https://www.deltastream.io/

DeltaStream is exhibiting this week at the Current Conference in Austin, Texas. 

14 May 2024

DeltaStream Joins the Connect with Confluent Partner Program

We’re excited to share that DeltaStream has joined the Connect with Confluent technology partner program.

Why this partnership matters

Confluent is a leader in streaming data technology, used by many industry professionals. This collaboration enables organizations to process and organize their Confluent Cloud data streams easily and efficiently from within DeltaStream. This breaks down silos and opens up powerful insights into your streaming data, the way it should be.

Build real-time streaming applications with DeltaStream

DeltaStream is a fully managed stream processing platform that enables users to deploy streaming applications in minutes, using simple SQL statements. By integrating with Confluent Cloud and other streaming storage systems, DeltaStream users can easily process and organize their streaming data in Confluent or wherever else their data may live. Powered by Apache Flink, DeltaStream users get the processing capabilities of Flink without the operational overhead that comes with it.

Unified view over multiple streaming stores

DeltaStream enables you to have a single view into all your streaming data across all your streaming stores. Whether you are using one Kafka cluster, multiple Kafka clusters, or multiple platforms like Kinesis and Confluent, DeltaStream provides a unified view of the streaming data, and you can write queries on these streams regardless of where they are stored.
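For example (the store and Relation names here are hypothetical), a single query can combine a Stream backed by one store with a Changelog backed by another, following the same join pattern DeltaStream SQL uses elsewhere:

```sql
-- Hypothetical example: "orders" is a Stream on a Kinesis store,
-- "customers" is a Changelog on a Confluent Cloud store.
-- DeltaStream resolves both, regardless of where the data lives.
SELECT o.orderid, o.amount, c.region
FROM orders o
JOIN customers c
  ON c.customerid = o.customerid;
```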

Break down silos with secure sharing

With namespacing, storage abstraction, and role-based access control, DeltaStream breaks down silos for your streaming data and enables you to share streaming data securely across multiple teams in your organization. With all your Confluent data connected to DeltaStream, data governance becomes easy and manageable.

How to configure the Confluent connector

While we have always supported integration with Kafka and continue to do so, we have now simplified the process for integrating with Confluent Cloud by adding a specific “Confluent” Store type. To configure access to Confluent Cloud within DeltaStream, users can simply choose “Confluent” as the Store type while defining their Store. Once the Store is defined, users will be able to share, process, and govern their Confluent Cloud and other streaming data within DeltaStream.
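As a rough sketch of what that definition looks like (the parameter names below are illustrative, not the product’s exact syntax; the linked tutorial covers the exact steps):

```sql
-- Illustrative sketch only: property names may differ in the actual product
CREATE STORE my_confluent_store WITH (
  'type' = CONFLUENT,
  'uris' = 'pkc-xxxxx.us-west-2.aws.confluent.cloud:9092',
  'kafka.sasl.username' = '<confluent_api_key>',
  'kafka.sasl.password' = '<confluent_api_secret>'
);
```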

To learn how to create a Confluent Cloud Store, either follow this tutorial or watch the video below.

Getting Started

To get started with DeltaStream, schedule a demo with us or sign up for a free trial. You can also learn more about the latest features and use cases on our blogs page.

09 Jan 2024

Bundle Queries with Applications by DeltaStream

Stream processing has turned into an essential part of modern data management solutions. It provides real-time insights which enable organizations to make informed decisions in a timely manner. Stream processing workloads are often complex to write and expensive to run. This is due to the high volumes of data that are constantly flowing into these workloads and the need for the results to be produced with minimal delay.

In the data streaming world, it’s common to think about your stream processing workload as pipelines. For instance, you may have a stream processing job ingest from one data stream, process the data, then write the results to another data stream. Then, another query will ingest the results from the first query, do further processing, and write another set of results to a third stream. Depending on the use case, this pattern of reading, processing, and writing continues until eventually you end up with the desired set of data. However, these intermediate streams may not be needed for anything other than being ingested by the next query in the pipeline. Reading, storing, and writing these intermediate results costs money in the form of network I/O and storage. For SQL-based stream processing platforms, one solution is to write nested queries or queries containing common table expressions (CTEs), but for multi-step pipelines, queries written in this way can become overly complex and hard to reason through. Furthermore, it may not even be possible to use nested queries or CTEs to represent some use cases, in which case materializing the results and running multiple stream processing jobs is necessary.
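To make the CTE point concrete, here is a hedged sketch (generic names, illustrative syntax) of a two-step pipeline collapsed into one statement. Each additional pipeline step adds another CTE or level of nesting, which is where readability starts to suffer:

```sql
-- Step 1 as a CTE: enrich raw events with user attributes
WITH enriched AS (
  SELECT e.eventid, e.pageid, u.region
  FROM events e
  JOIN users u ON u.userid = e.userid
)
-- Step 2: aggregate the enriched stream
SELECT pageid, COUNT(*) AS views
FROM enriched
GROUP BY pageid;
```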

To make stream processing simpler, more efficient, and more cost effective, we wanted to have the capability of combining multiple statements together. We did this by creating a new feature called Applications, which has the following benefits:

  • Simplify the workload: Users can simplify a complex computation logic by dividing it into several statements without additional costs. This helps break down a complicated workload into multiple steps to improve readability of the computation logic and reusability of results. Smaller, distinct statements will also facilitate debugging by isolating the processing steps.
  • Reduce the load on streaming Stores: An Application in DeltaStream optimizes I/O operations on streaming Stores in two ways. First, the Application will only read from a unique source Relation’s topic once. This reduces the read operations overhead when multiple queries in the Application consume records from the same Relation. Second, users can eliminate the reads/writes from intermediate queries by specifying “Virtual Relations” in the Application. “Virtual Streams” and “Virtual Changelogs” are similar to regular Streams and Changelogs in DeltaStream, but they are not backed by any physical streaming Store. Instead, Virtual Relations are for intermediate results and other statements in the Application are free to read/write to them.
  • Reduce overall execution cost and latency: All statements in an Application run within a single runtime job. This not only reduces the overall execution cost by minimizing the total number of jobs needed for a workload, but also enhances resource utilization. Packing several statements together facilitates efficient resource sharing and lowers scheduling overhead for the shared resources. Additionally, the optimized I/O operations on streaming Stores (as previously mentioned) along with less network traffic from/to those Stores contribute to the overall cost and latency reduction.

Simplifying Data Workloads with Applications

Let’s go over an example to show how Applications can help users write a workload in a simpler and more efficient manner.

Assume we are managing an online retail store and we are interested in extracting insights on how users visit different pages on our website. There are two Kafka Stores/clusters, one in the US East region and one in the US West to store page views in each region. Registered users’ information is also stored separately in the US West Store. We have the following Relations defined on topics from these Stores:

  • “pageviews_east” and “pageviews_west” are two Streams defined on the topics in the US East and US West stores, respectively
  • “users_log” is a Changelog defined on the users’ information topic in the US West Store, using the “userid” column as the primary key

You can find more details about Stores, Streams and Changelogs in DeltaStream and how to create them here.

Our online advertisement team is curious to find out which product pages are popular as users browse the website. A page is popular if it is visited by at least 3 different female users from California in a short duration. Using the three Relations we defined above, we’ll introduce a solution to find popular pages without using an Application, then compare that with an approach that uses an Application.

“No Application” Solution

One way to find popular pages is by writing 4 separate queries: 

  • Query 1 & Query 2: Combine pageview records from the “pageviews_west” and “pageviews_east” Streams into a single relation. The resulting Stream is called “combined_pageviews”.
  • Query 3: Join “combined_pageviews” records with records from “users_log” to enrich each pageview record with its user’s latest information. The resulting Stream is called “enriched_pageviews”.
  • Query 4: Group records in “enriched_pageviews” by their “pageid” column, aggregate their views, and find those pages that meet our popular page criteria.    
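Sketched in DeltaStream SQL (mirroring the Application code later in this post, but with each statement as a standalone query over physical Relations), the four queries look roughly like this:

```sql
-- Query 1: seed the combined stream from the US East Store
CREATE STREAM combined_pageviews AS
SELECT * FROM pageviews_east;

-- Query 2: merge in pageviews from the US West Store
INSERT INTO combined_pageviews
SELECT * FROM pageviews_west;

-- Query 3: enrich each pageview with the user's latest information
CREATE STREAM enriched_pageviews AS
SELECT v.userid, v.pageid, u.gender, u.contactinfo->`state` AS user_state
FROM combined_pageviews v JOIN users_log u ON u.userid = v.userid;

-- Query 4: find pages with at least 3 distinct recent female CA visitors
CREATE CHANGELOG popular_pages AS
SELECT pageid, count(DISTINCT userid) AS cnt
FROM enriched_pageviews
WHERE (UNIX_TIMESTAMP() - (rowtime()/1000) < 30) AND
  gender = 'FEMALE' AND user_state = 'CA'
GROUP BY pageid
HAVING count(DISTINCT userid) > 2;
```

Here “combined_pageviews” and “enriched_pageviews” are regular Streams backed by Kafka topics, so every intermediate record is written to, and read back from, a Store.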

Figure 1 shows how the data flows between the different Relations (shown as rounded boxes) and the queries (shown as colored boxes). Each query results in a separate runtime job and requires its own dedicated resources to run. The dashed arrows between the Relations and the Stores indicate read and write operations against the Kafka topics backing each of the Relations. Moreover, given that each Relation is backed by a topic, all records are written into a persistent Store, including records in the “combined_pageviews” and “enriched_pageviews” Streams.

Solution with Application

Ideally, we are interested in reducing the cost of running our workload without modifying its computation logic. In the “No Application” solution above, while the records for “combined_pageviews” and “enriched_pageviews” are persisted, we don’t really need them outside the Application context. They are intermediate results, only computed to prepare the data for finding popular pages. “Virtual Relations” can help us skip materializing intermediate query results.

We can write an Application with Virtual Relations to find popular pages, as shown below. This Application creates a single long-running job and helps reduce costs in two ways:

  1. By using Virtual Relations, we can avoid the extra cost of writing intermediate results into a Store and reading them again. This will reduce the network traffic and the read/write load on our streaming Stores.
  2. By packing several queries into a single runtime job, we can use our available resources more efficiently. This will reduce the number of streaming jobs we end up creating in order to run our computation logic.

Here is the Application code for our example:

  BEGIN APPLICATION popular_pages_app

  -- statement 1
  CREATE VIRTUAL STREAM virtual.public.combined_pageviews AS
  SELECT *
  FROM pageviews_east;

  -- statement 2
  INSERT INTO virtual.public.combined_pageviews
  SELECT *
  FROM pageviews_west;

  -- statement 3
  CREATE VIRTUAL STREAM virtual.public.enriched_pageviews AS
  SELECT v.userid, v.pageid,
    u.gender, u.contactinfo->`state` AS user_state
  FROM virtual.public.combined_pageviews v JOIN users_log u
    ON u.userid = v.userid;

  -- statement 4
  CREATE CHANGELOG popular_pages WITH ('store'='us_west_kafka', 'topic.partitions'=1, 'topic.replicas'=3) AS
  SELECT pageid, count(DISTINCT userid) AS cnt
  FROM virtual.public.enriched_pageviews
  WHERE (UNIX_TIMESTAMP() - (rowtime()/1000) < 30) AND
    gender = 'FEMALE' AND user_state = 'CA'
  GROUP BY pageid
  HAVING count(DISTINCT userid) > 2;

  END APPLICATION;

There are 4 statements in this Application. They look similar to the 4 queries used in the “no Application” solution above. However, the “combined_pageviews” and “enriched_pageviews” Streams are defined as Virtual Relations in the Application.

Figure 2 illustrates how the data flows between the different statements and Relations in the Application. Compared to Figure 1, note that the “combined_pageviews” and “enriched_pageviews” Virtual Streams (shown in white boxes) do not have dashed arrows leading to a Kafka topic in the storage layer. This is because Virtual Relations are not backed by physical storage, and thus reads and writes to the Store for these Virtual Relations are eliminated, reducing I/O and storage costs. In addition, the 4 queries in Figure 1 generate 4 separate streaming jobs, whereas all processing happens within a single runtime job in the solution using an Application.

Platform for Efficient Stream Processing

In this blog, we compared two solutions for a stream processing use case, one with an Application and one without. We showed how Applications and Virtual Relations can be used to run workloads more efficiently, resulting in reduced costs. The SQL syntax for Applications helps users simplify complex computation logic by breaking it into several statements, and allows reusing intermediate results at no additional cost. Stay tuned for more content on Applications in the future, where we’ll dive more deeply into their individual benefits and include more detailed examples.

DeltaStream is easy to use, easy to operate, and scales automatically. If you are ready to try a modern stream processing solution, you can reach out to our team to schedule a demo or start your free trial.

21 Dec 2023

DeltaStream: A Year of Innovation and Growth

2023 has been a remarkable year for DeltaStream. Filled with innovation, growth, and resilience, it has been nothing short of transformative: a year of significant advancements in our mission to make stream processing accessible and powerful for everyone.

We wanted to take a moment to highlight the key achievements from this year that have propelled us forward:

Expanding Expertise

Previously focused solely on engineering, we welcomed a dedicated Go-To-Market team in 2023, leading to increased content creation through engaging blog posts, webinars, and video tutorials. This expansion reflects our commitment to broader engagement and community building through education.

Building a Robust Platform

2023 saw the release of many exciting capabilities by DeltaStream. We launched a public SaaS offering, where both control and data planes reside within DeltaStream’s VPC. While this solution serves many, we recognize the need for stricter data management. In response, we built our private SaaS offering, Bring Your Own Cloud (BYOC). With this option, DeltaStream runs the data plane within the user’s own VPC, ensuring data sovereignty. Announced in August 2023, BYOC has received enthusiastic feedback from customers and prospects.

Deepening Ecosystem Integration

Through user feedback, we identified Snowflake and Databricks as popular destinations for streaming data. We responded by establishing seamless integrations within DeltaStream, allowing users to build streaming pipelines and materialize results in either platform.

Open Source Contributions

Recognizing the power of collaboration, we open-sourced our Snowflake connector for Apache Flink. This connector facilitates native integration between other data sources and Snowflake. Open sourcing it aligns with our vision of providing a unified view over all data and making stream processing possible for any product use case.

Streamlining Change Data Capture

DeltaStream now supports Change Data Capture (CDC) for PostgreSQL, enabling real-time data integration and consistent updates across systems.

Free Trial Availability

We understand the importance of hands-on experience. This year, we launched a free trial that allows users to explore DeltaStream’s features and functionalities firsthand. Additionally, we launched a click-through experience so users are able to experience our first-in-class UI without registration.

Achieving SOC 2 Compliance

Demonstrating our commitment to security, DeltaStream achieved SOC 2 Type II compliance late Q2 2023. This certification was important for us to show our commitment to protecting customer data, ensuring operational excellence, and maintaining a secure environment.

Engaging the Community

We actively participated in various conferences and events, including Current 2023, Flink Forward, and numerous Data Infrastructure gatherings. We sponsored Current 2023, hosted networking events, presented a session on securing streaming data, and showcased DeltaStream to the broader community.

Looking Ahead

2023 has been a year of tremendous growth and innovation for DeltaStream. We have achieved significant milestones and established ourselves as a leading provider of stream processing solutions. As we look forward to 2024 and beyond, we are committed to continuous development and community engagement. We believe that stream processing has the power to revolutionize data-driven decision-making, and we are excited to be at the forefront of this transformation.

Thank you to our customers, community, and partners for your unwavering support. Together, we are making stream processing a reality for organizations of all sizes.

07 Nov 2023

Open Sourcing our Apache Flink + Snowflake Connector

At DeltaStream, our mission is to bring a serverless and unified view of all streams to make stream processing possible for any product use case. By using Apache Flink as our underlying processing engine, we have been able to leverage its rich connector ecosystem to connect to many different data systems, breaking down the barriers of siloed data. As we mentioned in our Building Upon Apache Flink for Better Stream Processing article, using Apache Flink at DeltaStream is about more than adopting robust software with a good track record. Flink has allowed us to iterate faster on improvements or issues that arise from solving the latest and greatest data engineering challenges. However, one connector was missing until today: the Snowflake connector.

Today, in our efforts to make solving data challenges possible, we are open sourcing our Apache Flink sink connector built for writing data to Snowflake. This connector has already provided DeltaStream with native integration between other sources of data and Snowflake. This also aligns well with our vision of providing a unified view over all data, and we want to open this project up for public use and contribution so that others in the Flink community can benefit from this connector as well.

The open source repository will be open for any contributions, suggestions, or discussions. In this article, we touch on some of the highlights around this new Flink connector.

Utilizing the Snowflake Sink

The connector uses the latest Flink Sink&lt;InputT&gt; and SinkWriter&lt;InputT&gt; interfaces to build a Snowflake sink and write data to a configurable Snowflake table:

Diagram 1: Each SnowflakeSinkWriter inserts rows into the Snowflake table using its own dedicated ingest channel

The Snowflake sink connector can be configured with a parallelism greater than 1, where each task relies on the order in which it receives data from its upstream operator. For example, the following shows how data can be written with a parallelism of 3:

  // given a DataStream<InputT> stream and a SnowflakeSink<InputT> sink:
  stream.sinkTo(sink).setParallelism(3);
Diagram 1 shows the flow of data between TaskManager(s) and the destination Snowflake table. The diagram is heavily simplified to focus on the concrete SnowflakeSinkWriter<InputT>, and it shows that each sink task connects to its Snowflake table using a dedicated SnowflakeStreamingIngestChannel from Snowpipe Streaming APIs.

The SnowflakeSink<InputT> is also shipped with a generic SnowflakeRowSerializationSchema<T> interface that allows each implementation of the sink to provide its own concrete serialization to a Snowflake row of Map<String, Object> based on a given use case.

Write Records At Least Once

The first version of the Snowflake sink can write data into Snowflake tables with the delivery guarantee of NONE or AT_LEAST_ONCE, using AT_LEAST_ONCE by default. Supporting EXACTLY_ONCE semantics is a goal for a future version of this connector.

The sink writes data into its destination table after buffering records for a fixed time interval. This buffering interval is also bounded by Flink’s checkpointing interval, which is configured as part of the StreamExecutionEnvironment. In other words, if Flink’s checkpointing interval and the buffering interval are configured to different values, records are flushed at the shorter of the two intervals:

  StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  env.enableCheckpointing(100L);
  SnowflakeSink<Row> sf_sink = SnowflakeSink.<Row>builder()
      .bufferTimeMillis(1000L)
      .build(jobId);
  env.fromSequence(1, 10).map(new SfRowMapFunction()).sinkTo(sf_sink);
  env.execute();

In this example, the checkpointing interval is set to 100 milliseconds and the buffering interval to 1 second, which tells the Flink job to flush records at least every 100 milliseconds, i.e., on every checkpoint.

Read more about Snowpipe Streaming best practices in the Snowflake documentation.

The Flink Community, to Infinity and Beyond

We are very excited about the opportunity to contribute our Snowflake connector to the Flink community. We’re hoping that this connector will add more value to the rich connector ecosystem of Flink that’s powering many data application use cases.

If you want to check out the connector for yourself, head over to the GitHub repository, or if you want to learn more about DeltaStream’s integration with Snowflake, read our Snowflake integration blog.

01 Aug 2023

Private SaaS General Availability and Free Trial Opportunity for DeltaStream

Throughout the first half of 2023, DeltaStream made strides exiting stealth and emerging in the real-time streaming market with the general availability of Private SaaS, also known as “Bring Your Own Cloud” (BYOC), and a free trial version of our platform. With the rising volume of real-time data use, streaming data customers can increase their data processing efficiency to better address mission-critical business challenges.

The general availability of DeltaStream with Private SaaS offers our customers a significant advantage: offloading operations while ensuring the security of their data. This option lets customers scale their own infrastructure while maintaining data security requirements. DeltaStream operates the data plane within a customer’s VPC and manages all aspects of its operation.

“It has been a great year so far for DeltaStream,” said Hojjat Jafarpour, founder of DeltaStream. “We’ve been engaging with the data streaming community and showing how DeltaStream can improve their real-time data experience with our free trial.” As for BYOC, Jafarpour had this to say: “We believe a Private SaaS option is vital for our customers moving forward. Security and efficiency are top priorities for companies handling data, and they are top priorities for us.”

Key H1 Announcements:

  • Launch of New User Interface
    • In Q2 DeltaStream unveiled an improved UI for the DeltaStream platform. Working with a decorated UI designer, the updated interface prioritizes human-centric design and usability.
  • SOC 2 Compliance
    • DeltaStream achieved SOC 2 Type II compliance late Q2 2023. This third-party industry validation demonstrates DeltaStream’s commitment to security.
  • Attendance at RTA Summit, Databricks Summit
    • Throughout 2023, DeltaStream has attended industry shows and conferences, including sponsoring the RTA Summit – meeting data enthusiasts, learning from speakers, and showing off the DeltaStream platform.
  • Hosted First Webinar
    • In June, DeltaStream successfully hosted its first webinar, discussing Streaming Analytics and Streaming Databases. Hojjat Jafarpour, founder of DeltaStream and creator of KSQL, hosted the event to an engaged audience.
  • Launch of New Website
    • Following its first marketing hire in early Q1, DeltaStream unveiled a new website with an ongoing series of product content and an updated look and feel.

20 Jun 2023

DeltaStream Announces SOC 2 Type II Compliance

DeltaStream is excited to announce today that we have achieved SOC 2 Type II compliance in accordance with American Institute of Certified Public Accountants (AICPA) standards for SOC for Service Organizations. Achieving this standard with an unqualified opinion serves as third-party industry validation that DeltaStream, Inc. provides enterprise-level security for customers’ data secured in our system.

Achieving SOC 2 compliance demonstrates our commitment to protecting customer data, ensuring operational excellence, and maintaining a secure environment. To our current and future customers: we are committed to managing data to the highest standards of security and compliance.

See our security page for more.

07 Mar 2023


Introducing DeltaStream 2023

I created DeltaStream because I saw the complexity, missing features, and cost of existing stream processing solutions, and knew there was a better way.

For the last two years, we’ve been working diligently to bring DeltaStream to life.

DeltaStream is a unified serverless stream processing platform to manage, secure and process all your event streams and is based on Apache Flink. We built DeltaStream to provide a comprehensive stream processing platform that is easy to use, easy to operate, and scales automatically.

Given the current economic environment, buyers increasingly demand solutions that solve key business problems and deliver tangible ROI without runaway costs. DeltaStream does this. I see opportunities for DeltaStream to enable businesses to quickly derive insights from real-time data while alleviating the engineering toil of managing infrastructure.

Our Platform

Unified: Addressing the fragmented stream processing space.

We built a platform that:

  • Works across multiple data streaming stores (e.g., Apache Kafka and Kinesis).
  • Works across multiple Kafka clusters or Kinesis Data Streams.
  • Is complete – it enables both stateless and stateful stream processing, from joins to materialized views.
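As an illustration of the stateful side, a streaming materialized view can be declared in a few lines of SQL. This is a hedged sketch in a ksqlDB-style dialect; the stream, column, and view names are hypothetical, and DeltaStream’s actual syntax may differ.

```sql
-- Hypothetical stateful query: a continuously maintained aggregation
-- over a stream of page views. All object names are illustrative.
CREATE MATERIALIZED VIEW pageviews_per_region AS
  SELECT region,
         COUNT(*) AS view_count
  FROM pageviews
  GROUP BY region;
```

Because the view is maintained incrementally as events arrive, reads against it reflect the latest state of the stream rather than a periodic batch snapshot.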

Serverless and Scale: Automating infrastructure and scale

We made a platform that is easy to use, scale, and maintain:

  • Zero operations – Users can get started in minutes and deploy apps on Day 1
  • Multi-tenancy – With query isolation, scaling existing use cases or onboarding new ones is no longer a concern.
  • Scale on Demand – Our serverless architecture automatically scales up/down as needed

Secure: Access and Sharing

We developed a platform with a security story for “Data in Motion” that is on par with “Data at Rest”

  • Access Control – Our RBAC allows granular control over event streams
  • Namespacing – Create logical boundaries to organize and manage your real-time data.
  • Sharing and Governance – Leveraging RBAC and namespacing to securely share event streams in real time, not only within the organization but also with third-party users.
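The RBAC model described above follows the familiar grant/revoke pattern from SQL databases. The following is a hedged sketch of what granting a role access to a namespaced stream could look like; the role, schema, and stream names are hypothetical, and DeltaStream’s actual RBAC syntax may differ.

```sql
-- Illustrative RBAC sketch: define a role, then grant it scoped access
-- to a stream inside a namespace. All identifiers are hypothetical.
CREATE ROLE analyst;
GRANT USAGE ON SCHEMA clickstream TO ROLE analyst;
GRANT SELECT ON STREAM clickstream.pageviews TO ROLE analyst;
```

Combining grants with namespaces is what allows an event stream to be shared with a third party without exposing the rest of the organization’s data.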

Our team continues to grow and further expand our platform. We’re excited by what we can accomplish in 2023 and beyond. If you’re going to be at RTA Summit in April, I hope you’ll come see me speak or find time to meet.

Please reach out to us for a demo of DeltaStream; I am excited to share it with you.

11 Jul 2022


Introducing DeltaStream and announcing our $10M seed funding and private beta availability.

Imagine you are a financial institution receiving a stream of credit card transactions that your customers are performing anytime, anywhere. You need to process each transaction, detect whether it is fraudulent, and if so block it. Timeliness is essential: you cannot rely on nightly or even hourly jobs to perform the processing and detect the fraud. By the time a periodic batch job flags a fraudulent transaction, the transaction has already been approved and your institution has suffered the loss. By employing stream processing, however, you can detect and respond to such fraud with sub-second latency and prevent substantial financial loss. This is just one example of how access to fresh, low-latency data can provide a huge competitive advantage. Similar use cases appear in banking, the Internet of Things, retail, IT, gaming, health care, manufacturing, and many more. Customers and users demand low-latency services, and organizations that provide them gain a significant competitive advantage.
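The fraud scenario above can be expressed as a continuous query rather than a batch job. The sketch below uses a generic streaming SQL dialect; the stream and column names are hypothetical, and a real detection rule would be far richer than a single amount threshold.

```sql
-- Illustrative continuous query for the fraud scenario: every transaction
-- is evaluated as it arrives, and suspicious ones flow into a new stream.
-- All identifiers and the threshold are hypothetical.
CREATE STREAM suspected_fraud AS
  SELECT txn_id, card_id, amount
  FROM transactions
  WHERE amount > 10000;
```

Unlike a nightly batch job, this query runs continuously, so a downstream consumer of `suspected_fraud` can block a transaction within the sub-second window in which blocking is still useful.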

Stream processing, then and now

When I joined Confluent and started the ksqlDB (formerly known as KSQL) project, streaming storage platforms such as Apache Kafka were mainly used by tech-savvy Silicon Valley companies. Most data was at rest, and people were reluctant to deal with the complexity of introducing streaming into their architecture. Fast forward six years to 2022, and event streaming systems such as Apache Kafka are among the main components of modern data infrastructure. Many enterprises have adopted, or are in the process of adopting, event streaming platforms as the central nervous system of their data infrastructure. Furthermore, the availability of such systems as managed cloud services has made adoption even more compelling. Confluent Cloud, AWS Kinesis, Azure Event Hubs, and GCP Pub/Sub are just a few of the streaming storage services available in the cloud.

The adoption of such streaming storage services has also led to many applications being built on top of them. These applications process and react to events with sub-second latency, which in turn yields enormous financial gains for enterprises. An online retail business can increase its revenue by analyzing customer behavior in real time and recommending the right products while the user is shopping; such analysis cannot be done in batch mode, since by the time the result is available the customer has left the online store. Recommendation applications, along with streaming pipelines, anomaly detection, customer 360, clickstream analysis, inventory logistics, and log aggregation, are just a few of the applications built on top of platforms like Apache Kafka. However, building real-time streaming applications has been a challenging endeavor, requiring teams of developers highly skilled in distributed systems and data management. Delivery guarantees, fault tolerance, elasticity, and security are just a few of the many challenges that put real-time streaming applications out of reach for many organizations. And even if organizations overcome these challenges and build real-time applications, operating them 24/7 in a reliable, secure, and scalable way is a huge burden on their teams.

Enter DeltaStream

DeltaStream is a serverless stream processing platform to manage, secure, and process all your streams in the cloud. We built DeltaStream to take away the complexity of building and operating scalable, secure, and reliable real-time streaming applications, and to make doing so as easy, fast, and accessible as possible. To achieve this goal we are bringing the tried and true benefits of relational data management to the streaming world. Relational databases have successfully managed and processed data at rest for decades and have played a crucial role in democratizing access to data within organizations. In addition to processing capabilities, these systems provide familiar ways to organize and secure data. With DeltaStream, we bring not just familiar processing capabilities to the streaming world, but also similar ways to manage and secure streaming data — a unique differentiating capability of DeltaStream.

The following are some of the principles DeltaStream is built upon that make it a unique service:

  • DeltaStream is serverless: this means that as a developer, data engineer, or anyone who interacts with real-time streaming data, you don’t have to provision, scale, or maintain servers or clusters to build and run your real-time applications. There is no need to decide how big a deployment should be or how many tasks to allocate to your applications. DeltaStream takes care of all those complexities so you can focus on building the core products that bring value to you and your customers, instead of worrying about managing and operating distributed stream processing infrastructure. Indeed, there is no notion of a cluster or deployment in DeltaStream, and you can assume unlimited resources are available for your applications while you are building them. You only pay for what you use, and DeltaStream seamlessly scales your applications up or down as needed and recovers them from any failures.
  • Embrace SQL database model: for the past few decades, SQL databases have proven to be a great way to manage and process data. Simplicity and ubiquity of SQL has made it easy to query and access the data. However, many real-time streaming systems either do not utilize these capabilities or only try to use them partially for expressing processing logic with SQL and ignore other capabilities such as managing and securing access to the data. DeltaStream brings all the capabilities you know and are used to in the SQL databases for data at rest to the streaming world.
    • SQL: DeltaStream enables users to easily build real-time applications and pipelines with a familiar SQL dialect. From simple stateless processing such as filtering and projection to complex stateful processing such as joins and aggregations, everything can be done in DeltaStream with a few lines of SQL. DeltaStream seamlessly provides the desired delivery guarantees (exactly once or at least once), along with automatic checkpointing and savepointing for elasticity and fault tolerance.
    • Organizing your data in motion: Similar to SQL databases, DeltaStream enables you to organize your streaming data in databases and schemas. A database is a logical grouping of schemas and a schema is a logical grouping of objects such as streams, tables and views. This is the basis of namespacing in DeltaStream and enables our users to organize their data much more effectively compared to a flat namespace.
    • Securing your data in motion: securing your data is one of the foundational features of DeltaStream. In addition to common security practices such as data security, authentication, and authorization, DeltaStream provides the familiar Role-Based Access Control (RBAC) model from SQL databases, which lets users control who can access data and what operations they can perform with it. Users can define roles and grant or revoke privileges the same way they do in other SQL databases. The combination of DeltaStream’s namespacing and security capabilities provides a powerful tool for users to secure their data in motion.
  • Separation of Compute and Storage: DeltaStream’s architecture separates compute from storage, yielding well-known benefits such as elasticity, cost efficiency, and high availability. Additionally, DeltaStream’s model of providing the compute layer on top of users’ existing streaming storage systems, such as Apache Kafka or AWS Kinesis, eliminates the need for data duplication and doesn’t add unnecessary latency to real-time applications and pipelines. DeltaStream is also agnostic to the underlying storage service: it can read from and write to data-in-motion storage services such as Apache Kafka or AWS Kinesis and data-at-rest storage services such as AWS S3 or data lakes. This flexibility lets DeltaStream provide an abstraction layer over many storage services, where users can read data from one or more services, perform the desired computation, and write the results to one or more storage services seamlessly.
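The database/schema namespacing described above mirrors how relational databases organize objects. The following is a hedged sketch of what that hierarchy could look like in practice; all names and column definitions are hypothetical, and DeltaStream’s actual DDL may differ.

```sql
-- Illustrative namespacing sketch: a database groups schemas, and a schema
-- groups objects such as streams. All identifiers are hypothetical.
CREATE DATABASE payments;
CREATE SCHEMA payments.fraud;

-- A stream then lives under database.schema rather than in a flat namespace:
CREATE STREAM payments.fraud.transactions (
  txn_id  VARCHAR,
  card_id VARCHAR,
  amount  DOUBLE
);
```

Fully qualified names like `payments.fraud.transactions` are what make it possible to scope access control and sharing to a logical slice of an organization’s streaming data.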

As a cloud service, DeltaStream provides a REST API with GraphQL. There are three ways of using it today.

  • Web App: The DeltaStream web app is a browser-based application through which users can interact with the service.
  • Command Line Interface (CLI): Users can also use the DeltaStream CLI to interact with the service from their terminal. The CLI provides all the functionality of the Web App.
  • Direct access to the REST API: Users can access the service directly through the REST API, which enables them to integrate DeltaStream into their applications or CI/CD pipelines.

Currently, DeltaStream is available on AWS, and we plan to offer it on GCP and Azure soon.

What’s Next?

Today, we are excited to announce that we have raised a $10M seed round led by New Enterprise Associates (NEA). This funding will allow us to accelerate the delivery of DeltaStream. We are also announcing the availability of DeltaStream’s Private Beta. If you are on AWS and use an event streaming service such as Confluent Cloud, Redpanda, AWS MSK, or AWS Kinesis, please consider joining our Private Beta program to try DeltaStream and help shape our product roadmap. The Private Beta is free; we only ask for your feedback on using DeltaStream. To join our Private Beta program, please fill out the form here and tell us about your use cases.

We are also expanding our team and hiring for different roles. If you are passionate about building streaming systems, drop us a line and let’s chat.
