Chapter 3. Recommending Music and the Audioscrobbler Data Set
De gustibus non est disputandum.
(There’s no accounting for taste.)
When somebody asks what it is I do for a living, the direct answer of “data science” or “machine learning” sounds impressive but usually draws a blank stare. Fair enough; even actual data scientists seem to struggle to define what these mean—storing lots of data, computing, predicting something? Inevitably, I jump straight to a relatable example: “OK, you know how Amazon will tell you about books like the ones you bought? Yes? Yes! It’s like that.”
Empirically, the recommender engine seems to be an example of large-scale machine learning that everyone understands, and most people have seen Amazon’s. It is a common denominator because recommender engines are everywhere, from social networks to video sites to online retailers. We can also directly observe them in action. We’re aware that a computer is picking tracks to play on Spotify, whereas we don’t necessarily notice that Gmail is deciding whether inbound email is spam.
The output of a recommender is more intuitively understandable than that of other machine learning algorithms. It’s exciting, even. For as much as we think that musical taste is personal and inexplicable, recommenders do a surprisingly good job of identifying tracks we didn’t know we would like.
Finally, for domains like music or movies where recommenders are usually deployed, it’s comparatively easy to reason ...