Conquering All Your Stream Processing Needs with Kafka and Spark

On-demand recording

Kafka Summit 2016 | Systems Track

Apache Spark, specifically Spark Streaming, is becoming one of the most widely used stream processing systems for Kafka. At its heart, Spark is an extremely fast, general-purpose distributed data processing platform. This allows a single framework to unify all kinds of data processing: streaming, SQL, and machine learning. For Kafka users, this means they can use Spark to run batch jobs, streaming pipelines, and interactive queries on Kafka data. In this talk, I will give a brief overview of the Spark framework and explain how its different components can be used to process data from Kafka. Specifically, I will cover the following:

  • Real-time processing of Kafka streams with Spark Streaming
  • Batch and interactive querying of Kafka data with Spark and Spark SQL
  • Schema-aware streaming ETL from Kafka with Streaming DataFrames


Tathagata Das, Software Engineer, Databricks
