Analytics with Hadoop

Analytics is the process of finding significance in data so that it can support decision making. Decision makers who turn to data stored in Hadoop for answers will find numerous analytics options. For example, Hadoop-based databases like Apache Hive and Cloudera Impala offer SQL-like interfaces to HDFS-based data. For those with relational database experience, these SQL-like languages can be a simple path into analytics on Hadoop. For in-memory data processing, Apache Spark is available and can process data at a rate an order of magnitude faster than Hadoop MapReduce.

In this chapter, I will explain the building blocks ...

