May 2017
Intermediate to advanced
348 pages
7h 8m
English
In this recipe, we will look at how to make sense of the data stored on HDFS and extract useful information from the files, such as the number of occurrences of a string or a pattern, estimations, and various benchmarks. For this purpose, we can use MapReduce, a computation framework that helps us answer many of the questions we might have about the data.
With Hadoop, we can process huge amounts of data. However, to understand how it works, we'll start with a simple program such as a pi estimation or a word count example.
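To make the word count idea concrete before running anything on a cluster, the following is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It runs in a single process and does not use the Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample lines are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, mimicking what the framework does
    between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for a single word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input standing in for lines read from a file on HDFS.
    sample = [
        "Hadoop stores data on HDFS",
        "MapReduce processes data on Hadoop",
    ]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster, the mappers and reducers would run as separate tasks on different nodes and the framework would handle the shuffle; the sketch only mirrors the data flow.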
ResourceManager is the master for Yet Another Resource Negotiator (YARN). The NameNode stores the file metadata, while the actual blocks/data reside on the slave nodes, called DataNodes. ...