Live Online Training

Kafka Fundamentals

A hands-on course in mastering Kafka at scale

Topic: Data
Petter Graff

Apache Kafka is an increasingly popular foundation for large-scale software systems. In this course, you’ll learn how to use Kafka to publish and subscribe to data streams, and how to apply Kafka to a range of use cases. You’ll also learn how to install and configure a Kafka cluster, and how to use the Kafka APIs to produce and consume data. We’ll also discuss how to connect Kafka to technologies for stream processing, log aggregation, and other related big-data workloads.

What you'll learn, and how you can apply it

  • Why Kafka is scalable
  • How to interact with Kafka
  • Kafka’s role in enterprise architectures
  • How to design Kafka topics and partitions

Participants will be able to:

  • Install and configure Kafka
  • Publish data to Kafka
  • Subscribe to data from Kafka
  • Design Kafka topics and partitions

This training course is for you because...

  • You are a software architect with experience building enterprise systems, and you need to ensure that your systems are scalable and fault tolerant
  • You are a software developer with Java experience, and you need to build software on top of Kafka


Prerequisites:

  • Basic knowledge of Java

A GitHub link to installation instructions will be provided.

Recommended Preparation:

Docker: Up & Running, Chapter 3. Installing Docker

Introduction to Apache Kafka

About your instructor

  • Petter Graff is a partner at Northscaler, helping Fortune 500 companies reach their potential through training, consulting, and custom development. Petter has extensive experience building large-scale software systems for many Fortune 500 companies. He was the main architect behind the open source project Yaktor (yaktor.io), which relies on Kafka to deliver messages across large clusters of computation nodes. Yaktor and Kafka have been used by various companies to build systems processing millions of messages per second. Petter is also a frequent conference speaker and an O’Reilly author (check out his video series on Design Patterns in Java).


The timeframes are only estimates and may vary according to how the class is progressing.


Introduction (Lecture ~ 20 min)

  • Who are we
  • What is Kafka
  • Explain first lab

Verify that everything is installed and working (Lab ~ 20 min)

  • Install Kafka through Docker
  • Run a simple example of Kafka

Introduction to Kafka (Lecture ~ 30 min)

  • Kafka under the hood
  • What is a topic
  • What is a partition
  • What is a producer
  • What is a consumer
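
As a small preview of the terms above, the following is a minimal sketch, not Kafka code: it models a single partition as an in-memory, append-only log in which every record receives a sequential offset. The class `PartitionLogSketch` and its methods are hypothetical names chosen for illustration. A real partition lives on a broker and is persisted to disk, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of one partition; not part of the Kafka API.
final class PartitionLogSketch {
    // Records are only ever appended, so a list index doubles as the offset.
    private final List<String> records = new ArrayList<>();

    // Producer side: append a record and return the offset it was given.
    long append(String value) {
        records.add(value);
        return records.size() - 1;
    }

    // Consumer side: read up to maxRecords records starting at fromOffset.
    List<String> read(long fromOffset, int maxRecords) {
        int start = (int) Math.min(Math.max(fromOffset, 0), records.size());
        int end = Math.min(start + Math.max(maxRecords, 0), records.size());
        return new ArrayList<>(records.subList(start, end));
    }

    public static void main(String[] args) {
        PartitionLogSketch partition = new PartitionLogSketch();
        partition.append("first");
        partition.append("second");
        long offset = partition.append("third");
        System.out.println("last offset: " + offset);
        // A consumer that has already processed offset 0 resumes from offset 1.
        System.out.println("read from 1: " + partition.read(1, 10));
    }
}
```

The point of the model is that a consumer's position is just a number: reading never removes records, so several consumers can read the same partition at different offsets.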

Creating a topic and passing a message (Lab ~30 min)

  • Create a topic
  • Run a simple consumer
  • Run a simple producer

Dissecting the first example (Lecture/Discussion ~ 30 min)

  • Walkthrough of the first lab
  • Questions and answers

Design of Kafka topics and partitions (Lecture ~ 30 min)

  • Case study
  • How to select topics
  • How to select partitions
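
One idea behind partition design is that records with the same key should land in the same partition, which preserves their relative order. The following is a minimal sketch of that idea, assuming a simple hash-modulo scheme. `PartitionSketch`, `partitionFor`, and the `sensor-*` keys are hypothetical names for illustration, and the hash used here is a stand-in: Kafka's default partitioner applies a different hash function (murmur2) to the serialized key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the Kafka client API.
final class PartitionSketch {

    // Maps a record key to a partition index in [0, numPartitions).
    // Assumption: Arrays.hashCode stands in for the hash that Kafka's
    // default partitioner applies to the serialized key bytes.
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 6; // hypothetical partition count
        for (String key : new String[] {"sensor-1", "sensor-2", "sensor-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

Because the mapping depends on the partition count, changing the number of partitions later generally changes which partition a given key maps to. That is one reason to think about the partition count during design.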

Exercise: Designing topics and partitions (Group Project ~ 20 min)

  • Design topics and partitions


Evaluation of the designs and suggested solutions (Discussion ~20 min)

  • Discussion of the suggested solution(s)
  • Recommended design of case study

Implement Topics and Partitions for case study (Lab ~30 min)

  • Define a topic and its partitions in Kafka
  • Create a consumer and producer
  • Run a test script

Scaling Kafka (Lecture ~30 min)

  • Kafka Brokers
  • Kafka Clusters
  • Cluster mirroring
  • Consumer groups
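
Within a consumer group, each partition is read by exactly one consumer, so the partition count limits how far a group can scale out. The following is a minimal sketch of that rule using a plain round-robin assignment. `GroupAssignmentSketch`, `assign`, and the consumer names are hypothetical, and the strategy is a simplification for illustration rather than a reproduction of any of Kafka's built-in assignors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Kafka client API.
final class GroupAssignmentSketch {

    // Spreads partitions 0..numPartitions-1 over the consumers in turn.
    // Every partition gets exactly one owner within the group.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two consumers share four partitions.
        System.out.println(assign(List.of("consumer-a", "consumer-b"), 4));
        // Three consumers, two partitions: one consumer receives nothing.
        System.out.println(assign(List.of("consumer-a", "consumer-b", "consumer-c"), 2));
    }
}
```

In the second call one consumer ends up with an empty list, which mirrors the real behavior: consumers beyond the partition count in a group sit idle.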

Streaming APIs for Kafka (Lecture ~20 min)

  • What is streaming
  • Why use streams
  • Programming to streams
  • Example streams using Spark

Streaming and IoT Case Study (Lab ~30 min)

  • Consume a stream from Kafka
  • Build a Spark application over the Kafka stream

Kafka Administration and Integration (Lecture ~30 min)

  • Integration with Big Data tools (Storm, Spark, Hadoop)
  • Kafka Connect
  • Certified Kafka connectors
  • Kafka administration
  • Kafka monitoring
  • Security