Apache Kafka Series - Kafka Connect Hands-on Learning

Video description

A comprehensive new course for learning the Apache Kafka Connect framework through hands-on training. Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems. It is a common framework for Apache Kafka producers and consumers, and it offers an API, a runtime, and a REST service that let developers define connectors to move large data sets into and out of Apache Kafka in real time. As an extension of Apache Kafka, it inherits strong properties such as fault tolerance and elasticity.

Kafka Connect can ingest entire databases, collect metrics, and gather logs from all your application servers into Apache Kafka topics, making the data available for stream processing with low latency. Kafka Connect standardises the integration of other data systems with Apache Kafka, simplifying connector development, deployment, and management.

In this course, we are going to learn Kafka connector deployment, configuration, and management with hands-on exercises. We will also cover the distributed and standalone modes, which let Kafka Connect scale up to a large, centrally managed service supporting an entire organisation or scale down to development, testing, and small production deployments. Connectors are submitted to and managed on your Kafka Connect cluster through an easy-to-use REST API.
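As a rough illustration of submitting a connector over the REST interface, here is a minimal Python sketch that builds a `POST /connectors` request for the FileStream source connector. It is not taken from the course material. The worker address (`localhost:8083`, the default REST port), the connector name, the file path, and the topic are all assumptions made for the example.

```python
import json
import urllib.request

# Assumption: a Kafka Connect worker exposes its REST interface here
# (8083 is the default REST port; adjust for your cluster).
CONNECT_URL = "http://localhost:8083"


def build_create_connector_request(name, config, base_url=CONNECT_URL):
    """Build (but do not send) a POST /connectors request."""
    payload = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request to a running worker and return (status, body)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Hypothetical values: the connector name, file path, and topic are
# placeholders chosen for this example only.
file_source_config = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/demo-input.txt",
    "topic": "demo-topic",
}

request = build_create_connector_request("demo-file-source", file_source_config)
```

Calling `submit(request)` against a running distributed-mode worker would register the connector. Nothing is sent until `submit` is called, so the request can be inspected first.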

What You Will Learn

  • Configure and run Apache Kafka source and sink connectors
  • Learn concepts behind Kafka Connect and the Kafka Connect architecture
  • Launch a Kafka Connect cluster using Docker Compose
  • Deploy Kafka connectors in standalone and distributed modes
  • Write your own Kafka connector

Audience

Developers who want to learn the Apache Kafka Connect framework and get hands-on experience with it. Professionals who are familiar with the Apache Kafka ecosystem and its core concepts. Architects who want to understand how Kafka Connect fits into their solution architecture.

About The Author

Stéphane Maarek: Stéphane Maarek is a solutions architect, consultant, and software developer who has a particular interest in all things related to big data and analytics. He is also a bestselling instructor on Udemy for his courses on Apache Kafka, Apache NiFi, and AWS Lambda. He loves Apache Kafka and regularly contributes to the Apache Kafka project.

Stéphane has also written a guest blog post that was featured on the website of Confluent, the company behind Apache Kafka. He is also an AWS Certified Solutions Architect and has many years of experience with technologies such as Apache Kafka, Apache NiFi, Apache Spark, Hadoop, PostgreSQL, Tableau, Spotfire, Docker, Ansible, and more.

Table of contents

  1. Chapter 1 : Course Introduction
    1. Important Pre-Requisites
    2. Course Objectives
    3. Course Structure
    4. About Your Instructor
  2. Chapter 2 : Kafka Connect Concepts
    1. What is Kafka Connect?
    2. Kafka Connect Architecture Design
    3. Connectors, Configuration, Tasks, Workers
    4. Standalone vs Distributed Mode
    5. Distributed Architecture in Details
  3. Chapter 3 : Setup and Launch Kafka Connect Cluster
    1. Docker on Mac (recent versions)
    2. Docker Toolbox on Mac (older versions)
    3. Docker on Linux (Ubuntu as an example)
    4. Docker on Windows 10 64bit
    5. Docker Toolbox on Windows (older versions)
    6. Starting Kafka Connect Cluster using Docker Compose
  4. Chapter 4 : Troubleshooting Kafka Connect
    1. It's not working! What to do?
    2. Where to view logs?
  5. Chapter 5 : Kafka Connect Source - Hands On
    1. Kafka Connect Source Architecture Design
    2. FileStream Source Connector - Standalone Mode - Part 1
    3. FileStream Source Connector - Standalone Mode - Part 2
    4. FileStream Source Connector - Distributed Mode
    5. List of Available Connectors
    6. Twitter Source Connector - Distributed Mode - Part 1
    7. Twitter Source Connector - Distributed Mode - Part 2
    8. Section Summary
  6. Chapter 6 : Kafka Connect Sink - Hands On
    1. Kafka Connect Sink Architecture Design
    2. ElasticSearch Sink Connector - Distributed Mode - Part 1
    3. ElasticSearch Sink Connector - Distributed Mode - Part 2
    4. Kafka Connect REST API
    5. JDBC Sink Connector - Distributed Mode
  7. Chapter 7 : Writing your own Kafka Connector
    1. Goal of the section: GitHubSourceConnector
    2. Finding the code and installing required software
    3. Description of the GitHub Issues API
    4. Using the Maven Archetype to get started
    5. Config Definitions
    6. Connector Class
    7. Writing a schema
    8. Data Model for our Objects
    9. Writing our GitHub API HTTP Client
    10. Source Partition Source Offsets
    11. Source Task
    12. Building and running a Connector in Standalone Mode
    13. Deploying our Connector on the Landoop cluster
    14. More Resources for Developers
  8. Chapter 8 : Advanced Concepts
    1. Setting up Kafka Connect in Production (1/2)
    2. Setting up Kafka Connect in Production (2/2)
    3. What's next?

Product information

  • Title: Apache Kafka Series - Kafka Connect Hands-on Learning
  • Author(s): Stéphane Maarek
  • Release date: May 2018
  • Publisher(s): Packt Publishing
  • ISBN: 9781789344738