HDInsight Essentials - Second Edition by Rajesh Nadipalli

Using Sqoop to move data from RDBMS to Data Lake

Sqoop transfers data between relational databases and Hadoop. You can import data into HDInsight from any relational database that has a JDBC driver, such as SQL Server, MySQL, Oracle, or Teradata.
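To make the JDBC-based import concrete, the following is a minimal sketch that assembles a basic `sqoop import` command line in Python. The server name, database, table, user, and paths are hypothetical placeholders rather than values from this book, and the script only prints the command instead of running it.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table, target_dir):
    """Return the argument list for a basic single-table `sqoop import`."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC connection string of the source database
        "--username", username,
        "--password-file", password_file,  # file holding the password, kept off the command line
        "--table", table,                  # source table to import
        "--target-dir", target_dir,        # destination directory in Hadoop storage
    ]


# Hypothetical values, for illustration only.
cmd = build_sqoop_import(
    jdbc_url="jdbc:sqlserver://example-server:1433;database=sales",
    username="etl_user",
    password_file="/user/etl/sqoop.password",
    table="orders",
    target_dir="/data/lake/orders",
)

# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each option and its value separate, which avoids quoting mistakes when the connection string contains characters such as `;`.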

Key benefits

The major benefits of using Sqoop to move data are as follows:

  • It leverages RDBMS metadata to determine the column data types
  • It is simple to script and uses SQL
  • It can handle change data capture by importing daily transactional data to HDInsight
  • It uses MapReduce for import and export, which enables parallel, efficient data movement

Two modes of using Sqoop

Sqoop can be used to get data into and out of Hadoop; it has two modes of operation:

  • Sqoop import: Data moves from ...
