Project Metamorphosis : dévoilement de la plateforme de streaming d'événements nouvelle générationEn savoir plus

Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning without a Data Lake

Kai Waehner, Confluent

Machine Learning (ML) is separated into model training and model inference. ML frameworks typically use a data lake like HDFS or S3 to process historical data and train analytic models. But it’s possible to completely avoid such a data store, using a modern streaming architecture.

This talk compares a modern streaming architecture to traditional batch and big data alternatives and explains benefits like the simplified architecture, the ability of reprocessing events in the same order for training different models, and the possibility to build a scalable, mission-critical ML architecture for real time predictions with muss less headaches and problems.

The talk explains how this can be achieved leveraging Apache Kafka, Tiered Storage and TensorFlow.