Example 1 - RDBMS to Kafka

In earlier chapters we created some data for the Sqoop import/export examples. In this example we reuse that data, streaming it as events into Kafka using Flume.

Database Source Configuration:

  1. Copy the PostgreSQL driver jar, downloaded in an earlier chapter (Chapter 5, Data Acquisition of Batch Data with Apache Sqoop), into the ${FLUME_HOME}/lib folder:
cp ${SQOOP_HOME}/lib/postgresql-9.4.1212.jre6.jar ${FLUME_HOME}/lib
  2. An SQL source is not one of the standard sources bundled with the Flume distribution, so a third-party source needs to be downloaded and installed:
    1. Download the source from the following location, using this command:
wget https://github.com/keedio/flume-ng-sql-source/archive/1.4.2.tar.gz ...
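
As a minimal sketch of the driver copy in step 1, the cp command can be wrapped in a small shell function that checks that the jar and the target folder exist before copying. The function name install_jdbc_driver is hypothetical (it is not part of Flume or Sqoop); the paths in the usage note are the ones used in step 1.

```shell
# Hypothetical helper, not part of Flume or Sqoop:
# copy a JDBC driver jar into a lib folder, failing early if either path is missing.
install_jdbc_driver() {
    src_jar="$1"     # full path to the driver jar
    dest_dir="$2"    # target lib folder, e.g. Flume's lib folder

    if [ ! -f "$src_jar" ]; then
        echo "driver jar not found: $src_jar" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "destination folder not found: $dest_dir" >&2
        return 1
    fi

    cp "$src_jar" "$dest_dir"/
}
```

With SQOOP_HOME and FLUME_HOME set as in the earlier chapters, it would be called as: install_jdbc_driver "${SQOOP_HOME}/lib/postgresql-9.4.1212.jre6.jar" "${FLUME_HOME}/lib"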
