Getting data into Hadoop
Now that we have put in all that up-front effort, let us look at ways of bringing the data out of MySQL and into Hadoop.
Using MySQL tools and manual import
The simplest way to export data into Hadoop is to use existing command-line tools and statements. To export an entire table (or indeed an entire database), MySQL offers the
mysqldump
utility. To do a more precise export, we can use a SELECT
statement of the following form:
SELECT col1, col2 from table INTO OUTFILE '/tmp/out.csv' FIELDS TERMINATED by ',', LINES TERMINATED BY '\n';
Once we have an export file, we can move it into HDFS using hadoop fs -put
or into Hive through the methods discussed in the previous chapter.
Have a go hero – exporting the employee table into ...
Get Hadoop: Data Processing and Modelling now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.