Hadoop in Practice, Second Edition by Alex Holmes

Chapter 7. Utilizing data structures and algorithms at scale

This chapter covers

  • Representing and using data structures such as graphs, HyperLogLog, and Bloom filters in MapReduce
  • Applying algorithms such as PageRank and semi-joins to large amounts of data
  • Learning how social networks recommend connections with people outside your network

In this chapter we’ll look at how you can implement algorithms in MapReduce to work with internet-scale data. We’ll focus on nontrivial data, which is commonly represented using graphs.

We’ll also look at how you can use graphs to model connections between entities, such as relationships in a social network. We’ll run through a number of useful algorithms that can be performed over graphs, ...
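
To make the idea of modeling connections as a graph concrete, here is a minimal in-memory sketch in Python. It is not the chapter's MapReduce implementation: the function names (`build_graph`, `friends_of_friends`) and the sample data are hypothetical, and the sketch assumes relationships are undirected and small enough to fit in memory. It stores each person's connections in an adjacency list and then finds people two hops away, which is one simple way to suggest connections outside someone's network.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency list from (person, person) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def friends_of_friends(graph, person):
    """Return people exactly two hops away who are not direct connections."""
    direct = graph.get(person, set())
    candidates = set()
    for friend in direct:
        # Collect every connection of each direct connection.
        candidates |= graph.get(friend, set())
    # Exclude existing connections and the person themselves.
    return candidates - direct - {person}


# Hypothetical sample relationships, for illustration only.
edges = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("alice", "erin"),
]
graph = build_graph(edges)
suggestions = friends_of_friends(graph, "alice")
```

Here `suggestions` holds the candidates for `alice`, reached through her direct connection `bob`. At scale the adjacency list would typically be partitioned across many machines rather than held in a single dictionary, which is where a framework such as MapReduce comes in.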
