The Hadoop ecosystem client
So far, we have discussed HBase clients that work in interactive mode and are synchronous in nature. For batch processing that runs in the background, such as building search indexes or generating statistical data for reports, a Hadoop ecosystem client such as Hive is used.
The Hadoop MapReduce framework is used to process large volumes of data. In MapReduce jobs, HBase can be used in a variety of ways: as a data source, as a target, or as both. This section does not cover MapReduce usage, as it was already covered in the previous chapter.
Hive is a data warehouse infrastructure built on top of Hadoop. Hive provides a SQL-like query language called HiveQL that allows querying the semi-structured ...