Project Metamorphosis: Unveiling the next-gen event streaming platform.Learn More
Déploiement d'Apache Kafka

290 Reasons to Upgrade to Apache Kafka

When we released Apache Kafka, we talked about all of the big new features we added: the new consumer, Kafka Connect, security features, and much more. What we didn’t talk about was something even more important, something that we had spent even more of our time on — correctness, bug fixes, and operability. These are always more important than new features.

According to Apache JIRA, 290 bugs have been fixed for release and some of them are quite important. Even more exciting is the fact that while working on, we added a brand new distributed testing framework and over 100 new test scenarios that use this framework. We are now testing replication, node failures, controller failures, MirrorMaker, rolling upgrades, security scenarios, Kafka connect failures, and much more. This allowed us to not only catch many issues for this release but will give us the confidence that we are maintaining the high quality that Kafka is known for in the future.

Here are some of the more noteworthy bugs we caught and fixes for Apache Kafka

    1. Replication is the backbone of Kafka. When replication goes wrong, bad things happen. In Kafka we fixed varied replication issues. For example, we found and fixed an obscure race condition where if a machine ever gets slow enough that context switching between threads is slower than a remote call, it is possible for a broker to think it has fallen out of sync and as a result delete all its data (KAFKA-2477), min.insync.replica default configuration not working as expected (KAFKA-2114) and replication lag being impossible to configure (KAFKA-1546).
    2. MirrorMaker is Kafka’s cross-cluster replication tool. In the 0.8 release line, MirrorMaker buffered messages between the consumers reading from source cluster and the producers writing to the destination. Consumed offsets were stored using a separate thread (marking messages as “done”). When MirrorMaker process crashed, in some cases messages in the buffer were considered “done” even though they were never written to the target cluster, thereby losing these messages. Kafka includes a newly refactored MirrorMaker with a simpler design that prevents message loss by making sure message offsets are stored only when we are certain the messages were written safely to the target cluster. (KAFKA-1997).
    3. Kafka application logs can be too chatty at INFO level but too quiet at WARN level. This makes it difficult to troubleshoot issues and sometimes causes false alarms. In Kafka we cleaned up the logs, making them more managable (See: KAFKA-2504, KAFKA-2288, KAFKA-2251, KAFKA-2522,  KAFKA-1461)
    4. Log Compaction is one of the most exciting Kafka features, enabling a variety of new use-cases. Unfortunately, it also had some nasty bugs, so many users opted out even for use cases where compaction was a natural fit. For we fixed a large number of log compaction bugs and limitations. The biggest improvement is the ability to compact topics with compressed messages (KAFKA-1374), but there was a very large number of additional improvements (KAFKA-2235. KAFKA-2163, KAFKA-2118, KAFKA-2660, KAFKA-1755).
    5. Connection leak can be an issue in a shared environments when applications connecting to Kafka can’t be relied on to properly close their connections. Kafka includes two patches that make the server much more efficient at detecting and cleaning dead connections (KAFKA-1282, KAFKA-2096).
    6. Kafka broker metadata includes a list of leaders and replicas for each partition. This metadata is stored in ZooKeeper and is also cached in memory of each broker. release includes multiple bug fixes for cases where the metadata cache falls out of sync (KAFKA-1867, KAFKA-1367, KAFKA-2722, KAFKA-972).
    7. The Request Purgatory, where client requests wait until they can be responded to, underwent a complete re-write into a far more efficient data structure in In the process we also fixed a bug where the purgatory was growing out of control (KAFKA-2147).
    8. Producer timeouts for the new producer were not strictly enforced in 0.8.2, so some operations would block for much longer than specified timeout. In the tracking of timeouts was improved and timeouts are now consistent and work as expected (KAFKA-2120).

I expect some of these issues may ring an alarm bell, maybe even a loud and annoying bell, in which case the reason to upgrade to Kafka should be clear. Even if you have not come across any of these issues yet, you don’t know when you will. It is much better to have time to plan for an upgrade, rather than have to upgrade under pressure because your production system just hit a bug that was fixed 8 months ago. 

To make things easier, Apache Kafka can be upgraded with no downtime by using rolling upgrades. Check our documentation to learn the exact process and start planning your upgrade.

Did you like this blog post? Share it now

Subscribe to the Confluent blog

More Articles Like This

Best Practices to Secure Your Apache Kafka Deployment

For many organizations, Apache Kafka® is the backbone and source of truth for data systems across the enterprise. Protecting your event streaming platform is critical for data security and often […]

Highly Available, Fault-Tolerant Pull Queries in ksqlDB

One of the most critical aspects of any scale-out database is its availability to serve queries during partial failures. Business-critical applications require some measure of resilience to be able to […]

Walmart’s Real-Time Inventory System Powered by Apache Kafka

Consumer shopping patterns have changed drastically in the last few years. Shopping in a physical store is no longer the only way. Retail shopping experiences have evolved to include multiple […]

Sign Up Now

Recevez jusqu'à 50 $ US de réduction sur votre facture chaque mois calendaire pour le premier trimestre.

Nouvelles inscriptions uniquement.

By clicking “sign up” above you understand we will process your personal information in accordance with our Politique de confidentialité.

En cliquant sur « Inscription » ci-dessus, vous acceptez les termes du/de la Conditions d'utilisation et de recevoir occasionnellement des e-mails publicitaires de la part de Confluent. Vous comprenez également que nous traiterons vos informations personnelles conformément à notre Politique de confidentialité.

Gratuit à vie sur un seul broker Kafka

Le logiciel permettra une utilisation illimitée dans le temps de fonctionnalités commerciales sur un seul broker Kafka. Après l'ajout d'un second broker, un compteur de 30 jours démarrera automatiquement sur les fonctionnalités commerciales. Celui-ci ne pourra pas être réinitialisé en revenant à un seul broker.

Sélectionnez un type de déploiement
Manual Deployment
  • tar
  • zip
  • deb
  • rpm
  • docker
Déploiement automatique
  • kubernetes
  • ansible

By clicking "download free" above you understand we will process your personal information in accordance with our Politique de confidentialité.

En cliquant sur « Téléchargement gratuit » ci-dessus, vous acceptez la Contrat de licence Confluent et de recevoir occasionnellement des e-mails publicitaires de la part de Confluent. Vous acceptez également que vos renseignements personnels soient traitées conformément à notre Politique de confidentialité.

Ce site Web utilise des cookies afin d'améliorer l'expérience utilisateur et analyser les performances et le trafic sur notre site Web. Nous partageons également des informations concernant votre utilisation de notre site avec nos partenaires publicitaires, analytiques et de réseaux sociaux.