Hadoop MapReduce v2 Cookbook - Second Edition by Thilina Gunarathne

Calculating Scatter plots using MapReduce

Another useful tool for analyzing data is a scatter plot, which shows the relationship between two measurements (dimensions) by plotting one against the other.

For example, this recipe analyzes the data to find the relationship between the size of a web page and the number of hits that page receives.

The following image shows the execution summary of this computation. Here, the map function calculates the message size (rounded to 1024 bytes) and emits it as the key, with one as the value. The reducer then counts the number of occurrences of each message size:

[Figure: Calculating Scatter plots using MapReduce]
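
The map and reduce steps described above can be sketched in plain Java, without the Hadoop API, to make the data flow concrete. This is only an illustration under stated assumptions, not the recipe's actual code: the class name `ScatterPlotSketch` and its methods are hypothetical, the input is a hard-coded list of sizes rather than parsed log records, and "rounded to 1024 bytes" is assumed to mean integer division by 1024 (rounding down).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of the recipe's map and reduce logic.
public class ScatterPlotSketch {

    // Map step: derive the key for one record.
    // Assumption: sizes are rounded down to a 1024-byte bucket.
    public static int bucketKey(long sizeInBytes) {
        return (int) (sizeInBytes / 1024);
    }

    // Map + reduce in one pass: each size emits (bucket, 1),
    // and the values for each bucket are summed.
    public static Map<Integer, Integer> countBySizeBucket(List<Long> sizes) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (long size : sizes) {
            counts.merge(bucketKey(size), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample sizes in bytes, for illustration only.
        List<Long> sizes = Arrays.asList(200L, 900L, 1500L, 2047L, 5000L);
        // Print "bucket start in bytes <TAB> occurrences", one point per line.
        countBySizeBucket(sizes).forEach((bucket, count) ->
                System.out.println((bucket * 1024) + "\t" + count));
    }
}
```

Each output line is one point of the scatter plot: the size bucket on one axis and the number of occurrences on the other. In a real Hadoop job the same logic would be split between a Mapper that emits the bucket key with a value of one and a Reducer that sums the values for each key.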

Getting ...