Applications – music fan community detection
We are now ready to apply the previous graph clustering method to cluster songs according to the tags attached to each song. Alternatively, the dataset of song playlists can be used to cluster songs that frequently appear in many playlists. The datasets that we are going to work with can be downloaded from http://www.cs.cornell.edu/~shuochen/lme/data_page.html. They consist of the following files:
train.txt: This file contains the playlist data, using integer IDs to represent songs
tags.txt: This file contains the social tags, using integer IDs to represent songs
song_hash.txt: This file maps a song ID to its title and artist
tag_hash.txt: This one maps a tag ID to its name ...
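To make the idea of clustering songs by their tags more concrete, here is a minimal sketch in plain Python, independent of any graph library. It turns per-song tag lists into weighted song-to-song edges, where the weight is the number of tags two songs share. The file layout is an assumption, not something stated above: line i is taken to hold the space-separated tag IDs of song i, with `#` marking a song without tags. The sample data and the helper names `parse_tags` and `shared_tag_edges` are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample in the ASSUMED tags.txt layout: line i holds the
# space-separated tag IDs of song i; "#" marks a song without tags.
sample_tags = """0 3
3 7
#
0 7 3
"""


def parse_tags(text):
    """Return {song_id: set of tag IDs}, using the line position as the song ID."""
    song_tags = {}
    for song_id, line in enumerate(text.splitlines()):
        line = line.strip()
        if line in ("", "#"):
            song_tags[song_id] = set()
        else:
            song_tags[song_id] = {int(tag) for tag in line.split()}
    return song_tags


def shared_tag_edges(song_tags):
    """Return {(song_a, song_b): shared tag count} for pairs sharing at least one tag."""
    # Invert the mapping first, so that only songs with a common tag are paired.
    songs_by_tag = defaultdict(list)
    for song_id, tags in song_tags.items():
        for tag in tags:
            songs_by_tag[tag].append(song_id)

    weights = defaultdict(int)
    for songs in songs_by_tag.values():
        for a, b in combinations(sorted(songs), 2):
            weights[(a, b)] += 1
    return dict(weights)


song_tags = parse_tags(sample_tags)
edges = shared_tag_edges(song_tags)
```

Each key of `edges` is an undirected song pair with the smaller ID first, so the result can be fed to a graph clustering method as a weighted edge list. Pairing songs per tag grows quadratically for very popular tags, so a real dataset would likely need such tags filtered or capped first.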