Log Compaction | Kafka Summit Edition | May 2016

Gwen Shapira

Last week, Confluent hosted Kafka Summit, the first ever conference to focus on Apache Kafka and stream processing. It was exciting to see the stream processing community coming together in one event to share their work and discuss possible improvements. The conference sold out several weeks in advance and over 550 Kafka enthusiasts attended.

The sessions overall were well received, thanks to all of the speakers who put in time and effort to contribute to the high quality of the conference – a special thanks to the speakers! I’d like to highlight a few of the sessions and discussions that attendees were especially excited about.

Hacking on Kafka Connect and Kafka Streams

On the Monday evening before the conference we held a Stream Data Hackathon. The room was packed with over 100 participants hacking away on experimental stream processing projects. There were many awesome projects and we will publish a separate blog post to share all of them. The winning projects combined creativity and usefulness:

  • Real-time sentiment analysis of tweets, used to evaluate and visualize how Twitter collectively feels about the presidential candidates in the US. Both Kafka Connect and Kafka Streams were used to implement this project. The project is by Ashish Singh from Cloudera.
  • Measuring electrical activity from the brain with a Bluetooth device, using Kafka to stream the data to OpenTSDB, and visualizing it with Grafana. The project is by a team from Silicon Valley Data Science.
  • Kafka Connector for streaming events from Jenkins to Kafka, in order to collect all the events regarding Jenkins Jobs in an organization to one central location. The project is by Aravind Yarram from Equifax.

Kafka Summit SF 2016

Keynote Sessions

The next day opened with a gourmet breakfast, immediately followed by three keynote talks. Neha Narkhede gave a wonderful overview of the growth of the Apache Kafka project and community since she and the other Kafka co-creators (Jay Kreps, Jun Rao, and others) started the project at LinkedIn. Then Jay Kreps shared his thoughts on the future of stream processing and how this new paradigm will change the way companies use data. Last (but not least), Aaron Schildkrout, Uber’s head of data and marketing (I love this title), discussed the ways his company uses Kafka and how their use cases are evolving. It’s pretty inspiring to think of drivers getting real-time feedback on their driving from their phones.

Breakout Sessions

After the keynote session, we headed to the 28 breakout sessions across three tracks:

  • Systems Track – focused on stream processing
  • Operations Track – how to run Kafka in production
  • Users Track – use cases and architectures

After the conference, I asked some of the attendees what their favorite sessions were.

In the Systems track, the attendees loved “Fundamentals of Stream Processing with Apache Beam” by Frances Perry and Tyler Akidau from Google. I’ve heard many attendees discuss how this presentation changed the way they think about stream processing applications. “Introducing Kafka Streams: Large-scale Stream Processing with Kafka” by Neha Narkhede was also incredibly popular, and many attendees are looking forward to the imminent release of Apache Kafka 0.10.0, which will include Kafka Streams.

In the Operations track, attendees enjoyed “101 Ways to Configure Kafka – Badly” by Henning Spjelkavik & Audun Strand, who shared all the mistakes they made as new Kafka users and how they corrected them. This presentation was a great mix of entertainment and education, and I’m sure no one who attended the session will end up with an 8-node ZooKeeper cluster.

In the Users track, attendees loved “Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData” by Anton Gorshkov from Goldman Sachs, who developed a stream processing application, live, including processing SMS messages sent by the audience in real time.

Video Recordings and Photos

Yes, we did record the sessions and they will be available in a week or so. I highly recommend checking them out. Links to the video recordings will be added to each of the session pages. Follow @ConfluentInc on Twitter and we’ll let you know as soon as they are ready.

We’ll also post some photos from the conference soon on the Confluent Facebook page.


As often happens at conferences, the sessions don’t tell the whole story. One of the highlights of the conference for me was interacting and exchanging ideas with the leaders of many different stream processing technologies. How often does it happen that leaders of Apache Storm, Apache Spark, Apache Flink, Apache Beam, and Apache Kafka get together to discuss abstractions, concepts, how to benchmark streams, and the best ways to educate an audience? Kafka Summit is, to the best of my knowledge, the only conference where the community gets together and shares its vision.

The Confluent team is looking forward to hosting Kafka Summit again next year. If you weren’t able to make it last week, fill out the Stay-In-Touch form on the home page and you’ll get updates about next year’s conference.

Thanks again to everyone who made it to Kafka Summit 2016 in San Francisco last week! The Confluent team enjoyed meeting everyone and we had a fantastic time!

Quick note on the next Apache Kafka release

A new release candidate for version 0.10.0 has been posted to the Apache Kafka mailing lists and a new vote was started. This release candidate actually contains two new features: Support for additional SASL authentication mechanisms (KIP-43) and a new API for clients to determine features supported by the brokers (KIP-35).
