Chapter 6. Distributing recommendation computations
This chapter covers
- Analyzing a massive data set from Wikipedia
- Producing recommendations with Hadoop and distributed algorithms
- Pseudo-distributing existing nondistributed recommenders
So far, this book has looked at increasingly large data sets: from tens of preferences, to 100,000, to 10 million, and then 17 million. But even 17 million is only medium-sized in the world of recommenders. This chapter ups the ante again by tackling a larger data set of 130 million preferences, in the form of article-to-article links from Wikipedia’s massive corpus. In this data set, the articles are both the users and the items, which also demonstrates how recommenders can be usefully applied, with Mahout, to less ...