A first look at the data
The dataset is composed of three tables: one for users, one for books, and one for ratings. The BX-Users table contains the users' data. The User-ID is a sequential integer value, as the original user ID has been anonymized. The Location and Age columns contain the corresponding demographic information. This information is not available for every user; where it is missing, the field holds the literal string NULL rather than an empty value.
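Because a missing value arrives as the text NULL, we have to convert it ourselves before treating Age as a number. The following is a minimal sketch of that conversion, not the loading code used later. It assumes a semicolon-delimited, double-quoted export of the BX-Users table. The sample rows, the `parse_users` helper, and the output key names are all hypothetical and serve only as illustration.

```python
import csv
import io

# Hypothetical sample rows. The ';' delimiter and the quoting style are
# assumptions about the export format, not something stated in the text.
SAMPLE = '''"User-ID";"Location";"Age"
"1";"springfield, illinois, usa";NULL
"2";"portland, oregon, usa";"18"
"3";NULL;NULL
'''


def parse_users(text):
    """Parse BX-Users style rows, mapping the string NULL to None."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    users = []
    for row in reader:
        location = row["Location"]
        age = row["Age"]
        users.append({
            "user_id": int(row["User-ID"]),
            # The literal string NULL marks missing demographic data.
            "location": None if location == "NULL" else location,
            "age": None if age == "NULL" else int(age),
        })
    return users


users = parse_users(SAMPLE)
missing_age = sum(1 for u in users if u["age"] is None)
print(f"{missing_age} of {len(users)} users have no age")
```

Mapping NULL to `None` at load time keeps later steps, such as averaging the age, from failing on a non-numeric string or silently counting it as a valid value.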
The BX-Books table stores the information about the books. For the unique identifier, we have the standard ISBN book code. Besides this, we are also provided with the book's title (the Book-Title column), author (Book-Author), publishing year (Year-of-Publication), and the publisher (Publisher). URLs of thumbnail ...