July 2015
Intermediate to advanced
480 pages
13h 43m
English
You can have data without information, but you cannot have information without data.
—Daniel Keys Moran
A significant share of the effort put into Hadoop goes into loading data into the cluster and unloading data from it. A number of ELT and third-party tools come with the Hadoop distribution (see Figure 5.1); the focus here, however, is on Hadoop's own tools, which include the following:
Flume (streaming)
Sqoop (SQL data sources)
WebHDFS (REST APIs)
HDFS NFS
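Of the tools listed, WebHDFS is the simplest to illustrate because it is plain HTTP. The following is a minimal sketch, not taken from the book, of how a WebHDFS request URL is put together. The host name and the helper `webhdfs_url` are hypothetical, and the default port of 50070 assumes a Hadoop 2.x NameNode; adjust both for your cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, params=None):
    """Build a WebHDFS REST URL of the form
    http://<host>:<port>/webhdfs/v1/<path>?op=<OP>&...

    host and port identify the NameNode's HTTP endpoint (assumed values),
    path is an absolute HDFS path, and op is a WebHDFS operation name
    such as LISTSTATUS, GETFILESTATUS, or OPEN.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute, e.g. /user/demo")
    query = {"op": op}
    query.update(params or {})
    # quote() leaves "/" intact but escapes spaces and other unsafe characters.
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        host, port, quote(path), urlencode(query)
    )


if __name__ == "__main__":
    # Hypothetical NameNode host; no network call is made here.
    print(webhdfs_url("namenode.example.com", "/user/demo", "LISTSTATUS"))
    print(webhdfs_url("namenode.example.com", "/user/demo/data.csv", "OPEN",
                      params={"user.name": "demo"}))
```

The sketch only builds the URL. Issuing it with any HTTP client (a GET for `LISTSTATUS` or `OPEN`) is what actually reads from the cluster, which is why WebHDFS is convenient for clients that have no Hadoop libraries installed.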
Figure 5.1 ...