December 2016
Beginner to intermediate
256 pages
7h 26m
English
You can have data without information, but you cannot have information without data.
Daniel Keys Moran
In This Chapter:
The data lake concept is presented as a new data processing paradigm.
Basic methods for importing CSV data into HDFS and Hive tables are presented.
Additional methods are presented for using Spark to import data into Hive tables or to read it directly in a Spark job.
Apache Sqoop is introduced as a tool for ...