Data Engineering with Python

by Paul Crickard
October 2020
Beginner to intermediate
356 pages
6h 50m
English
Packt Publishing

Chapter 13: Streaming Data with Apache Kafka

Apache Kafka opens up the world of real-time data streams. Stream processing and batch processing differ in fundamental ways, but the way you build data pipelines for each is very similar. Understanding how streaming differs from batch processing will allow you to build data pipelines that account for those differences.
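
To make the comparison concrete, here is a minimal sketch in plain Python, not taken from the book and not using Kafka. It assumes a hypothetical per-record transform, `to_fahrenheit`, and two hypothetical runners, `run_batch` and `run_stream`. The transform is identical in both cases; what changes is whether the whole dataset is available up front or records are handled one at a time as they arrive.

```python
from typing import Dict, Iterable, Iterator, List


def to_fahrenheit(record: Dict) -> Dict:
    """Hypothetical transform step shared by both pipelines."""
    return {**record, "temp_f": record["temp_c"] * 9 / 5 + 32}


def run_batch(records: List[Dict]) -> List[Dict]:
    # Batch: the full, bounded dataset exists before processing starts,
    # and results are returned only once every record has been handled.
    return [to_fahrenheit(r) for r in records]


def run_stream(records: Iterable[Dict]) -> Iterator[Dict]:
    # Stream: records are processed one at a time as they arrive.
    # The source may never end, so each result is yielded immediately.
    for r in records:
        yield to_fahrenheit(r)


# Illustrative data only; the sensor names and values are made up.
readings = [{"sensor": "a", "temp_c": 0}, {"sensor": "b", "temp_c": 100}]

batch_result = run_batch(readings)
stream_result = list(run_stream(iter(readings)))
```

In this sketch the generator stands in for a stream source. With a real Kafka consumer, the loop in `run_stream` would iterate over incoming messages instead of an in-memory list.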

In this chapter, we're going to cover the following main topics:

  • Understanding logs
  • Understanding how Kafka uses logs
  • Building data pipelines with Kafka and NiFi
  • Differentiating stream processing from batch processing
  • Producing and consuming with Python

Understanding logs

If you have written code, you may be familiar with software logs. Software developers ...

Publisher Resources

ISBN: 9781839214189