Importing data from Kafka into HDFS using Flume

Kafka is one of the most popular message queue systems in use today. Using Flume, we can listen to Kafka topics and write the message data directly into HDFS. Recent Flume versions include built-in Kafka support, which makes this import straightforward. In this recipe, we are going to learn how to import Kafka messages into HDFS.
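As a rough illustration of the idea, a Flume agent that reads from a Kafka topic and writes to HDFS might be configured along the following lines. This is only a minimal sketch, assuming the Kafka source property names used by Flume 1.6; the agent name (a1), the component names, the topic name, the ZooKeeper address, and the HDFS path are all placeholders chosen for this example and are not taken from the recipe.

```properties
# Hypothetical agent "a1" with one source, one channel, and one sink.
a1.sources = kafkaSrc
a1.channels = memCh
a1.sinks = hdfsSink

# Kafka source (Flume 1.6 style: connects through ZooKeeper).
# The ZooKeeper address and topic name are assumptions for this sketch.
a1.sources.kafkaSrc.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.kafkaSrc.zookeeperConnect = localhost:2181
a1.sources.kafkaSrc.topic = test-topic
a1.sources.kafkaSrc.groupId = flume
a1.sources.kafkaSrc.channels = memCh

# In-memory channel buffering events between source and sink.
a1.channels.memCh.type = memory
a1.channels.memCh.capacity = 10000
a1.channels.memCh.transactionCapacity = 1000

# HDFS sink writing plain-text events; the target path is a placeholder.
a1.sinks.hdfsSink.type = hdfs
a1.sinks.hdfsSink.hdfs.path = /flume/kafka-events
a1.sinks.hdfsSink.hdfs.fileType = DataStream
a1.sinks.hdfsSink.hdfs.writeFormat = Text
a1.sinks.hdfsSink.hdfs.rollInterval = 60
a1.sinks.hdfsSink.channel = memCh
```

If this were saved as, say, kafka-hdfs.conf (a hypothetical file name), the agent could be started with flume-ng agent --conf conf --conf-file kafka-hdfs.conf --name a1. A memory channel keeps the sketch short; a file channel would be the more durable choice if losing buffered events on restart is a concern.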

Getting ready

To perform this recipe, you need a running Hadoop cluster with a recent version of Flume installed on it. Here I am using Flume 1.6. You also need Kafka installed and running on one of the machines; I am using kafka_2.10-0.9.0.0.

How to do it...

  1. To import the data from Kafka, first you need to have Kafka running on your machine. The following command starts Kafka ...
