August 2012
Intermediate to advanced
332 pages
7h 3m
English
HBase has an importtsv tool to support importing data from TSV files into HBase. Using this tool to load text data into HBase is very efficient, because it runs a MapReduce job to perform the importing. Even if you are going to load data from an existing RDBMS, you can dump data into a text file somehow and then use importtsv to import dumped data into HBase. This approach works well when importing a huge amount of data, as dumping data is much faster than executing SQL on RDBMS.
The importtsv tool does not only load data directly into an HBase table, it also supports generating HBase internal format (HFile) files, so that you can use the HBase bulk load tool to load generated files directly ...