Data Engineering with Python

by Paul Crickard
October 2020
Beginner to intermediate
356 pages
6h 50m
English
Packt Publishing

Chapter 12: Building a Kafka Cluster

In this chapter, you will move beyond batch processing – running queries on a complete set of data – and learn about the tools used in stream processing. In stream processing, the data may be infinite and incomplete at the time of a query. One of the leading tools for handling streaming data is Apache Kafka. Kafka lets you send data in real time to topics, and consumers read those topics and process the data. This chapter will teach you how to build a three-node Apache Kafka cluster. You will also learn how to create and send messages (produce) and read data from topics (consume).
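
Before building a real cluster, it may help to see the produce and consume idea in miniature. The following is a toy, in-memory sketch and not Kafka's API: the names ToyBroker, ToyConsumer, produce, and poll, and the "users" topic, are hypothetical and chosen only for illustration. It assumes a topic can be modelled as an append-only list and that a consumer tracks how far it has read with an offset.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker: each topic is an append-only list."""

    def __init__(self):
        self.topics = defaultdict(list)

    def produce(self, topic, message):
        """Append a message to the topic and return its position (offset)."""
        log = self.topics[topic]
        log.append(message)
        return len(log) - 1


class ToyConsumer:
    """Hypothetical consumer: reads one topic in order and remembers its offset."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self):
        """Return the messages appended since the last poll (possibly none)."""
        log = self.broker.topics[self.topic]
        new_messages = log[self.offset:]
        self.offset = len(log)
        return new_messages


broker = ToyBroker()
consumer = ToyConsumer(broker, "users")

broker.produce("users", "alice")
broker.produce("users", "bob")
first_batch = consumer.poll()      # everything produced so far

broker.produce("users", "carol")   # the stream keeps growing after the first read
second_batch = consumer.poll()     # only what arrived since the previous poll
```

The point of the sketch is that the first read never sees a complete dataset: data that arrives later is only picked up by a later poll, which is what distinguishes a stream from a batch query.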

In this chapter, we're going to cover the following main topics:

  • Creating ZooKeeper and Kafka clusters ...


ISBN: 9781839214189