September 2017
Beginner to intermediate
412 pages
8h 55m
English
This chapter presents some of the ideas and algorithms involved in the analysis of very large datasets. The two main algorithms are Google's PageRank algorithm and the MapReduce framework.
To illustrate how MapReduce works, we have implemented the WordCount example, which counts the frequencies of the words in a collection of text files. The more realistic implementation would be with MongoDB, which is presented in Chapter 10, NoSQL Databases, or in Apache Hadoop, which is briefly described in this chapter.
Read now
Unlock full access