Hands-On Deep Learning with Apache Spark

by Guglielmo Iozzia
January 2019
Content level: Intermediate to advanced
322 pages
7h 29m
English
Packt Publishing

Spark Streaming and Kafka

There are two ways to use Spark Streaming with Kafka: the receiver-based approach and the direct approach. The first is similar to streaming from other sources such as text files and sockets: data received from Kafka is stored in Spark executors and processed by jobs launched by a Spark Streaming context. This is not the best approach, because it can cause data loss in the event of failures. The direct approach (introduced in Spark 1.3) is therefore preferable. Instead of using receivers to receive data, it periodically queries Kafka for the latest offsets in each topic and partition and, accordingly, defines the offset ranges to process for each batch. When the jobs to process the data are executed, Kafka's simple ...
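The offset bookkeeping behind the direct approach can be illustrated with a small, self-contained sketch. It is plain Python and is not Spark's implementation: the names `OffsetRange`, `plan_batch`, and `advance` are hypothetical helpers introduced here, the offsets are made-up sample values, and starting unseen partitions at offset 0 is an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (topic name, partition number); an illustrative alias, not a Spark or Kafka type.
TopicPartition = Tuple[str, int]


@dataclass(frozen=True)
class OffsetRange:
    """Hypothetical record of the slice of one partition assigned to a batch."""
    topic: str
    partition: int
    from_offset: int   # first offset to read (inclusive)
    until_offset: int  # offset to stop before (exclusive)

    def count(self) -> int:
        """Number of messages covered by this range."""
        return self.until_offset - self.from_offset


def plan_batch(consumed: Dict[TopicPartition, int],
               latest: Dict[TopicPartition, int]) -> List[OffsetRange]:
    """Compare the latest offsets with what was already consumed and
    return one range per partition that has new messages."""
    ranges = []
    for (topic, partition), until in sorted(latest.items()):
        # Assumption: a partition seen for the first time starts at offset 0.
        start = consumed.get((topic, partition), 0)
        if until > start:
            ranges.append(OffsetRange(topic, partition, start, until))
    return ranges


def advance(consumed: Dict[TopicPartition, int],
            ranges: List[OffsetRange]) -> Dict[TopicPartition, int]:
    """Return the consumed-offset map after a batch has been processed."""
    updated = dict(consumed)
    for r in ranges:
        updated[(r.topic, r.partition)] = r.until_offset
    return updated


# Sample values standing in for one polling interval.
consumed = {("events", 0): 100, ("events", 1): 40}
latest = {("events", 0): 130, ("events", 1): 40, ("events", 2): 5}

batch = plan_batch(consumed, latest)
for r in batch:
    print(r.topic, r.partition, r.from_offset, r.until_offset, r.count())

consumed = advance(consumed, batch)
```

Because each batch is defined by explicit offset ranges rather than by whatever a receiver happened to buffer, a failed batch can in principle be re-read from the same ranges, which is the property the direct approach relies on.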


Publisher Resources

ISBN: 9781788994613