Currently one of the hottest projects across the Hadoop ecosystem, Apache Kafka is a distributed, real-time data system that works much like a pub/sub messaging service, but with better throughput and with built-in partitioning, replication, and fault tolerance. In this video course, host Gwen Shapira from Cloudera shows developers and administrators how to integrate Kafka into a data processing pipeline.
You’ll start with Kafka basics, walk through code examples of Kafka producers and consumers, and then learn how to integrate Kafka with Hadoop. By the end of this course, you’ll be ready to use this service for large-scale log collection and stream processing.
- Learn Kafka’s use cases and the problems that it solves
- Understand the basics, including logs, partitions, replicas, consumers, and producers
- Set up a Kafka cluster, starting with a single node before adding more
- Write producers and consumers, using old and new APIs
- Use the Flume log aggregation framework to integrate Kafka with Hadoop
- Configure Kafka for availability and consistency, and learn how to troubleshoot various issues
- Become familiar with the entire Kafka ecosystem
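As a rough illustration of the basics listed above (logs, partitions, producers, and consumers), here is a toy in-memory model written in plain Python. It is not taken from the course and does not use any Kafka client library: the `Topic` and `Consumer` classes, the CRC32-based partition choice, and the host names are all assumptions made for this sketch, and replicas are not modeled at all.

```python
import zlib


class Topic:
    """Toy stand-in for a topic: a fixed set of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption for this sketch: a stable hash of the key picks the
        # partition, so records with the same key stay in the same log.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        # "Producer" side: append to the end of one partition log and
        # return the (partition, offset) where the record landed.
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


class Consumer:
    """Toy consumer that remembers the next offset to read per partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self):
        # Return every record appended since the last poll, then advance
        # the stored offsets so the same records are not returned again.
        records = []
        for p, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[p]:])
            self.offsets[p] = len(log)
        return records


topic = Topic("logs", num_partitions=3)
for i in range(6):
    topic.append(f"host-{i % 2}", f"line {i}")

consumer = Consumer(topic)
first = consumer.poll()   # all six records, grouped by partition
second = consumer.poll()  # nothing new since the first poll
```

The point of the sketch is the division of labor: the log only ever grows at the end, the key decides the partition, and the reader (not the log) keeps track of how far it has read. Ordering is preserved within a partition, so records that share a key come back in the order they were appended.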
Gwen Shapira is a software engineer at Cloudera with 15 years of experience working with customers to design scalable data architectures. Having worked as a data warehouse DBA, ETL developer, and senior consultant, Gwen specializes in building scalable data processing pipelines and integrating existing data systems with Hadoop. She’s a committer to Apache Sqoop and an active contributor to Apache Kafka.
Table of contents
- The Case for Kafka 00:11:23
- The Basics 00:09:10
- Setting up a Kafka Cluster 00:15:30
- Writing a Kafka Producer 00:14:33
- Writing a Kafka Consumer 00:16:34
- Using Kafka from Python 00:08:03
- Troubleshooting Kafka 00:29:29
- Integrating Kafka and Hadoop with Flafka 00:26:06
- Kafka Availability and Consistency 00:22:38
- Kafka Ecosystem 00:13:13
- Future of Kafka 00:08:53
- Title: Introduction to Apache Kafka
- Release date: March 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491923306