Chapter 5. Connectors in Action

Having covered how Kafka Connect works and how to use it, it’s time to put that knowledge into practice! In this chapter we look at, and run, three of the most popular connectors: a sink connector for Amazon Simple Storage Service (S3), a JDBC source connector, and a MySQL source connector. We explain the use cases they target and their most important configurations, and demonstrate how to use them in various scenarios.

These three connectors address common use cases and appear in many pipelines across all industries, so having a good understanding of them is valuable. Even if you don’t use these specific connectors, we expect many of the topics covered to be applicable to others.

All of the examples assume that you have a Kafka cluster running with a bootstrap server accessible at localhost:9092.

Confluent S3 Sink Connector

One of the most common use cases of Kafka Connect is to export data from Kafka into a storage system. Often, you need to keep data long after it has been processed; it could be for legal reasons, for preserving historical data, or simply for batch-oriented systems that only run periodically. While Kafka can store data indefinitely, if you handle very large amounts of data it can become costly to store it in Kafka forever.

Cloud storage systems like Amazon S3 are designed for storing large amounts of data for long durations at a low cost per gigabyte. In addition, storage services can be used as data lakes due to their ...
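As a preview of what configuring such a connector involves, here is a minimal sketch of a Confluent S3 sink connector configuration, in the JSON form accepted by the Kafka Connect REST API. The topic, bucket name, region, and flush size are placeholder values you would replace for your environment; consult the connector documentation for the full set of options:

```json
{
  "name": "s3-sink",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "1",
    "topics": "my-topic",
    "s3.bucket.name": "my-bucket",
    "s3.region": "us-east-1",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
    "flush.size": "1000"
  }
}
```

With a Connect worker running on its default port, a configuration like this is typically submitted by POSTing the JSON to the worker’s `/connectors` endpoint. The `flush.size` setting controls how many records are accumulated before an object is written to S3, which directly affects object size and upload frequency.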
