Project Metamorphosis : dévoilement de la plateforme de streaming d'événements nouvelle générationEn savoir plus

ksqlDB 0.12.0 Introduces Real-Time Query Upgrades and Automatic Query Restarts

The ksqlDB team is pleased to announce ksqlDB 0.12.0. This release continues to improve upon the usability of ksqlDB and aims to reduce administration time. Highlights include query upgrades, which let you evolve queries even as they process events, and automatic restarts for persistent queries when they encounter errors.

Below, we’ll step through the most noteworthy changes. Check out the changelog for the complete list, including bug fixes and other enhancements.


Updating persistent queries with “CREATE OR REPLACE”

ksqlDB has many supported operations for persistent queries, including the ability to filter results with a WHERE clause. But what happens when you would like to change the filtering criteria? Previously, such a change would require terminating the query and recreating an updated version with the new criteria. With ksqlDB 0.12.0, you can now modify an existing query in place and not miss a beat in processing your streams.

WHERE clause expression upgrades

Let’s look at a motivating example. Imagine a query that reads from a stream of purchases made at ksqlDB’s fictional flagship store, ksqlMart, and filters out transactions that might be invalid:

CREATE STREAM purchases (product_id INT KEY, name VARCHAR, cost DOUBLE, quantity INT);
CREATE STREAM valid_purchases AS SELECT * FROM purchases WHERE cost > 0.00 AND quantity > 0;

Over time, ksqlMart changes its return policy and begins issuing full refunds. These events have a negative cost column value. Since these events are now valid, ksqlMart needs to update the query to remove the cost > 0.00 clause:

CREATE OR REPLACE STREAM valid_purchases AS SELECT * FROM purchases WHERE quantity > 0;

This CREATE OR REPLACE statement instructs ksqlDB to terminate the old query and create a new one with the new semantics. The new query will continue from the last event that the previous query processed. ksqlDB supports nearly all upgrades to WHERE clause expressions.

Schema upgrades

As time goes on, let’s imagine ksqlMart gets more sophisticated in their usage of Apache Kafka® to monitor their input. They start publishing a new field to the purchases stream, named popularity. In order to reflect this change in their valid_purchases stream, they need to issue two different commands:

CREATE OR REPLACE STREAM purchases (product_id INT KEY, name VARCHAR, cost DOUBLE, quantity INT, popularity DOUBLE); 
CREATE OR REPLACE STREAM valid_purchases AS SELECT * FROM purchases WHERE quantity > 0;

The first statement adds the field popularity to the stream purchases, while the second statement ensures that the SELECT * expression is reevaluated so that popularity is added to valid_purchases as well.

With these two powerful upgrade mechanisms, queries can adapt as your uses evolve, and, most importantly, your updated queries pick up right where they left off! For more details and examples on this feature, read the documentation.

Auto restarts for persistent queries

While processing data in a persistent query, the system can sometimes hit a condition that prevents the query from making progress, putting it into an ERROR state. Perhaps you’ve encountered this yourself due to transient networking issues or a Kafka ACL change, for example.

Historically, this scenario has required a restart of your server (and all the running persistent queries) to recover the one query that has exited with a transient error.

Starting with ksqlDB 0.12.0, the persistent queries will automatically be restarted and begin where they left off. This often reduces operational burden and saves time and resources because intervention isn’t required. More details can be found on GitHub.

Get started

Get started with ksqlDB today, via the standalone distribution or with Confluent Cloud, and join the community in our #ksqldb Confluent Community Slack channel.

Alan Sheinberg is a software engineer on the ksqlDB team where he focuses his efforts on improving functionality and performance of pull and push queries. Prior to joining Confluent, Alan worked in various areas from self-driving cars to ads at companies like Uber and Google.

Did you like this blog post? Share it now

Subscribe to the Confluent blog

More Articles Like This

Bounding ksqlDB Memory Usage

It’s important to be able to limit memory consumption of a running service. Use too much, and your application will crash and need to restart. If this happens too often,

How Real-Time Materialized Views Work with ksqlDB, Animated

All around the world, companies are asking the same question: What is happening right now? We are inundated with pieces of data that have a fragment of the answer. But

Restoring Balance to the Cluster: Self-Balancing Clusters in Confluent Platform 6.0

Apache Kafka® scales well. A Kafka cluster can grow to tens or hundreds of brokers and easily sustain tens of GB per second of read and write traffic. But scaling