O'Reilly logo

Machine Learning in Java by Boštjan Kaluža

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Building a recommendation engine

To demonstrate both the content-based filtering and collaborative filtering approaches, we'll build a book-recommendation engine.

Book ratings dataset

In this chapter, we will work with book ratings dataset (Ziegler et al, 2005) collected in a four-week crawl. It contains data on 278,858 members of the Book-Crossing website and 1,157,112 ratings, both implicit and explicit, referring to 271,379 distinct ISBNs. User data is anonymized, but with demographic information. The dataset is available at:

http://www2.informatik.uni-freiburg.de/~cziegler/BX/.

The Book-Crossing dataset comprises three files described at their website as follows:

  • BX-Users: This contains the users. Note that user IDs (User-ID) have been anonymized ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required