Chapter 6. Distributing recommendation computations

This chapter covers

Analyzing a massive data set from Wikipedia
Producing recommendations with Hadoop and distributed algorithms
Pseudo-distributing existing nondistributed recommenders

So far, this book has looked at increasingly large data sets: from tens of preferences, to 100,000, to 10 million, and then 17 million. But that is still only medium-sized in the world of recommenders. This chapter ups the ante again by tackling a larger data set of 130 million preferences, in the form of article-to-article links from Wikipedia’s massive corpus.^[1] In this data set, the articles are both the users and the items, which also demonstrates how recommenders can be usefully applied, with Mahout, to less ...
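To make the "articles are both the users and the items" idea concrete, here is a minimal sketch in plain Python rather than Mahout's own API. It assumes, purely for illustration, that each input line has the form `sourceID: targetID targetID ...`; that layout, the sample IDs, and the helper name `links_to_preferences` are hypothetical and are not taken from the text above. Each link becomes one valueless (boolean) preference from the source article, acting as the user, to the target article, acting as the item.

```python
def links_to_preferences(lines):
    """Turn 'source: target target ...' lines into (user_id, item_id) pairs.

    The input layout is an assumption made for this sketch; each outgoing
    link is treated as a boolean preference with no rating value.
    """
    prefs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, _, targets = line.partition(":")
        user_id = int(source)  # the linking article plays the "user"
        for target in targets.split():
            prefs.append((user_id, int(target)))  # the linked article is the "item"
    return prefs


# Tiny made-up sample: article 1 links to 2 and 3, and so on.
sample = ["1: 2 3", "2: 3", "3: 1"]
prefs = links_to_preferences(sample)

# The same IDs appear on both sides of the preference pairs.
users = {user for user, _ in prefs}
items = {item for _, item in prefs}
```

In this toy sample every article ID appears both as a user and as an item, which mirrors the situation the paragraph describes for the link data.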

Mahout in Action by Sean Owen, B. Ellen Friedman, Robin Anil, Ted Dunning
