Kafka Essentials LiveLessons: A Quick-Start for Building Effective Data Pipelines
on-demand course

with Douglas Eadline
July 2025
Intermediate
8h 27m
English
Pearson
Closed Captioning available in German, English, Spanish, French, Italian, Japanese

Overview

8+ Hours of Video Instruction

Learn how to manage high-performance data pipelines, streaming analytics, data integration, and mission-critical applications with Apache Kafka.

Apache Kafka is a popular message broker providing data flow management between producer application sources and consumer destinations. Kafka Essentials: A Quick-Start for Building Effective Data Pipelines covers many essential and practical aspects of using and running the Apache Kafka event streaming platform.

Learn How To

  • Write a Kafka Python application to produce and consume data
  • Use the Kafkaesque GUI
  • Use keys and multiple partitions with Kafka topics
  • Develop Python consumers that access data by index or time stamp
  • Save Kafka log data to external storage and databases
  • Use and configure Kafka Connect services
  • Conduct image streaming with Kafka
  • Size hardware for a Kafka cluster
  • Install Kafka and Zookeeper across multiple servers
  • Configure partitions across multiple brokers
  • Administer Kafka broker partition allocation, log file management, topic management, monitoring, and benchmarking

About the Instructor

Doug Eadline is a practitioner and writer in the Linux cluster community and has documented many aspects of high-performance computing (HPC) and Hadoop/Spark computing. Currently, he is the editor of the HPCwire.com website and was previously the editor of ClusterWorld Magazine and a senior HPC Editor for Linux Magazine. Some of his popular video tutorials and books include Data Engineering Foundations LiveLessons Parts 1 and 2, Hadoop 2 Quick Start, High-Performance Computing for Dummies, and Practical Data Science with Hadoop and Spark.

Who Should Take This Course

  • You want to understand Apache Kafka and data streaming
  • You want to learn the basics of building data pipelines with Kafka using Python
  • Hands-on experience with examples is important to you when learning a new technology
  • You want to continue exploring Kafka using a complete copy of the instructor's hands-on notes, example code, and free virtual machine

Course Requirements

The course assumes familiarity with Python and the Bash command line on a modern Linux server. Python is used for all examples. Bash scripting is used to facilitate some examples and for installation and administration tasks.

Lesson Descriptions

Lesson 1: Kafka Background Concepts

In Lesson 1, Doug introduces Kafka by asking, "Why do I need a message broker?" Once answered, he explains the basic Kafka components, and then introduces the freely available Linux virtual machine that you will use to run many of the examples presented in the lessons. The lesson concludes with some basic examples of Kafka usage.
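The basic produce/consume cycle Doug demonstrates can be sketched with the third-party kafka-python package (an assumption; the course may use a different client library). The topic name and broker address below are placeholders:

```python
import json

def serialize(record):
    # Kafka stores raw bytes; encode each record as UTF-8 JSON.
    return json.dumps(record).encode("utf-8")

def produce_records(records, topic="sensor-data", broker="localhost:9092"):
    # Requires the kafka-python package and a running broker (assumptions).
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=broker, value_serializer=serialize)
    for rec in records:
        producer.send(topic, rec)
    producer.flush()   # block until all messages are acknowledged
    producer.close()

def consume_records(topic="sensor-data", broker="localhost:9092"):
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=broker,
        auto_offset_reset="earliest",   # read the topic log from the beginning
        consumer_timeout_ms=5000,       # stop iterating when the topic goes idle
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    return [msg.value for msg in consumer]
```

A call such as `produce_records([{"id": 1, "temp": 21.5}])` followed by `consume_records()` would round-trip a record through the broker.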

Lesson 2: Viewing Kafka Operations

Lesson 2 presents a Kafka graphical user interface, Kafkaesque. This interface lets you see inside Kafka topic logs, which will be used in many subsequent lessons. Doug uses Kafkaesque to review the basic examples from Lesson 1.

Lesson 3: Streaming NOAA Weather Data with Kafka Python

Lesson 3 provides a look at a simple Kafka Python application that produces (downloads) data from the NOAA weather site, and then consumes the data by loading it into a Pandas data frame. The examples are expanded to demonstrate the use of keys and multiple partitions with Kafka topics. The lesson concludes by developing Python consumers that access data by index or time stamp.
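Reading from a specific index or time stamp can be sketched with kafka-python's seek facilities (an assumption about the client library; the topic and broker names are placeholders):

```python
from datetime import datetime, timezone

def to_epoch_ms(dt):
    # Kafka indexes messages by millisecond epoch timestamps.
    return int(dt.timestamp() * 1000)

def read_from(topic, partition=0, offset=0, since=None, broker="localhost:9092"):
    # Requires the kafka-python package and a running broker (assumptions).
    from kafka import KafkaConsumer, TopicPartition
    tp = TopicPartition(topic, partition)
    consumer = KafkaConsumer(bootstrap_servers=broker, consumer_timeout_ms=5000)
    consumer.assign([tp])   # manual partition assignment, no consumer group
    if since is not None:
        # Map a wall-clock time to the first offset at or after it.
        result = consumer.offsets_for_times({tp: to_epoch_ms(since)})
        offset = result[tp].offset
    consumer.seek(tp, offset)   # jump straight to that index in the log
    return [msg.value for msg in consumer]
```

Passing `since=datetime(2025, 7, 1, tzinfo=timezone.utc)` would replay everything logged after that instant; passing only `offset` replays from a fixed index.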

Lesson 4: Moving Kafka Topic Data to External Storage

Lesson 4 shows you how to save Kafka log data to external storage. Examples include PySpark streaming and Python consumers that write to MariaDB (MySQL) and Apache HBase.
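The consumer-to-database pattern can be sketched as follows, assuming the kafka-python and mysql-connector-python packages; the topic, table, and credentials are hypothetical placeholders:

```python
import json

def insert_sql(table, columns):
    # Parameterized INSERT for the target table; %s placeholders let the
    # database driver escape values safely.
    cols = ", ".join(columns)
    marks = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({cols}) VALUES ({marks})"

def kafka_to_mariadb(topic="weather", broker="localhost:9092"):
    # Requires a running broker and a MariaDB server with the target
    # table already created (assumptions).
    from kafka import KafkaConsumer
    import mysql.connector
    db = mysql.connector.connect(host="localhost", database="kafka_sink",
                                 user="kafka", password="secret")
    cur = db.cursor()
    sql = insert_sql("weather", ["station", "temp"])
    consumer = KafkaConsumer(topic, bootstrap_servers=broker,
                             consumer_timeout_ms=5000,
                             value_deserializer=lambda b: json.loads(b.decode()))
    for msg in consumer:
        cur.execute(sql, (msg.value["station"], msg.value["temp"]))
    db.commit()   # persist the batch once the stream goes idle
```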

Lesson 5: Edge Image Streaming with Kafka Python

Lesson 5 demonstrates image streaming with Kafka. The example uses a Kafka Python producer to capture images from a 3D printer that are then examined by a Python consumer that performs real-time CNN analysis looking for defects. A simulated version is provided for the virtual machine.
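Streaming binary image frames follows the same producer pattern as text data, with two wrinkles: raising the maximum message size and keying frames so one printer's images stay in order (records with the same key land in the same partition). A minimal sketch, assuming kafka-python and placeholder topic/broker names:

```python
def frame_key(printer_id, frame_number):
    # Key each frame by printer and sequence number; a fixed-width sequence
    # keeps keys sortable and one printer's frames in a single partition.
    return f"{printer_id}:{frame_number:06d}".encode("utf-8")

def stream_frame(jpeg_bytes, printer_id, frame_number,
                 topic="printer-frames", broker="localhost:9092"):
    # Requires the kafka-python package and a running broker (assumptions).
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=broker,
                             max_request_size=5 * 1024 * 1024)  # allow large frames
    producer.send(topic, key=frame_key(printer_id, frame_number),
                  value=jpeg_bytes)
    producer.flush()
    producer.close()
```

A consumer on the other end would decode each `msg.value` back into an image before running the CNN defect check.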

Lesson 6: Data Pipelines and Kafka Connect

Lesson 6 introduces the Kafka Connect interface. Kafka connectors provide a quick method to use pre-written consumers and producers for many popular services. Doug demonstrates Kafka connectors for text files, HDFS, and MariaDB (MySQL), along with connector management methods.
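Connectors are managed by posting a JSON configuration to the Kafka Connect REST API (port 8083 by default). The sketch below registers the stock FileStreamSink connector that ships with Kafka; the connector name, topic, and output file are hypothetical:

```python
import json
from urllib import request

# Hypothetical sink config: stream a topic's records into a text file.
FILE_SINK = {
    "name": "file-sink-demo",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
        "topics": "weather",
        "file": "/tmp/weather.txt",
    },
}

def register_connector(config, connect_url="http://localhost:8083"):
    # POST the JSON config to a running Kafka Connect worker (assumption).
    req = request.Request(
        connect_url + "/connectors",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The same REST endpoint supports listing, pausing, and deleting connectors, which is the management workflow the lesson covers.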

Lesson 7: Installation Considerations

In Lesson 7, Kafka broker installation is discussed. Topics include hardware choices, a recipe with scripts used for installing Kafka and Zookeeper across multiple servers, and configuring partitions across multiple brokers.
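Spreading a topic's partitions across brokers can be sketched programmatically, assuming kafka-python's admin client; the topic name and broker address are placeholders, and the helper merely previews the balanced round-robin layout Kafka aims for by default:

```python
def spread_partitions(num_partitions, brokers):
    # Preview a round-robin assignment of partition leaders across brokers.
    return {p: brokers[p % len(brokers)] for p in range(num_partitions)}

def create_topic(name, partitions=3, replication=2, broker="localhost:9092"):
    # Requires the kafka-python package and a running multi-broker
    # cluster (assumptions); replication must not exceed the broker count.
    from kafka.admin import KafkaAdminClient, NewTopic
    admin = KafkaAdminClient(bootstrap_servers=broker)
    topic = NewTopic(name=name, num_partitions=partitions,
                     replication_factor=replication)
    admin.create_topics([topic])
    admin.close()
```

For example, `spread_partitions(4, ["broker1", "broker2"])` alternates the four partitions between the two brokers.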

Lesson 8: Basic Administration Topics

Lesson 8 presents basic administration of Kafka brokers. Doug discusses various aspects of partition allocation and log file management. Coverage then moves to Kafka topic management, monitoring, and benchmarking of Kafka clusters.

All code, background, and links for this video can be found at: https://www.clustermonkey.net/download/LiveLessons/Kafka_Essentials/

About Pearson Video Training

Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Sams, and Que. Topics include: IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development, and more. Learn more about Pearson Video training at http://www.informit.com/video.



Publisher Resources

ISBN: 9780138176761