To demonstrate both the content-based filtering and collaborative filtering approaches, we'll build a book-recommendation engine.
In this chapter, we will work with a book ratings dataset (Ziegler et al., 2005) collected in a four-week crawl. It contains data on 278,858 members of the Book-Crossing website and 1,157,112 ratings, both implicit and explicit, referring to 271,379 distinct ISBNs. User data is anonymized but includes demographic information. The dataset is available at:
The Book-Crossing dataset comprises three files, described on its website as follows: