
Apache Spark Graph Processing by Rindra Ramamonjison

Applications – music fan community detection

We are now ready to apply the previous graph clustering method to cluster songs according to the tags attached to each song. Alternatively, a dataset of song playlists can be used to cluster songs that frequently appear in the same playlists. The datasets that we are going to work with can be downloaded from http://www.cs.cornell.edu/~shuochen/lme/data_page.html. They consist of the following files:

  • train.txt: This file contains the playlist data, using integer IDs to represent songs
  • tags.txt: This file contains the social tags, using integer IDs to represent songs
  • song_hash.txt: This file maps a song ID to its title and artist
  • tag_hash.txt: This file maps a tag ID to its name ...
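
To make the role of the lookup files concrete, here is a minimal sketch of how a file such as song_hash.txt could be turned into an ID-to-song mapping. The text above only says that the file maps a song ID to its title and artist, so the tab-separated `id, title, artist` layout is an assumption, as are the function name `parse_song_hash` and the sample lines, which are invented placeholders rather than real rows from the dataset. Check the downloaded file before relying on this layout.

```python
from typing import Dict, Iterable, Tuple


def parse_song_hash(lines: Iterable[str], sep: str = "\t") -> Dict[int, Tuple[str, str]]:
    """Build {song_id: (title, artist)} from lines of a song_hash-style file.

    ASSUMPTION: each line looks like "<id><sep><title><sep><artist>".
    The real file layout is not described in the text and may differ.
    """
    songs: Dict[int, Tuple[str, str]] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        parts = [p.strip() for p in line.split(sep)]
        if len(parts) < 3:
            continue  # skip lines that do not match the assumed layout
        songs[int(parts[0])] = (parts[1], parts[2])
    return songs


# Hypothetical sample lines, for illustration only (not taken from the dataset).
sample = ["0\tSong A\tArtist A\n", "1\tSong B\tArtist B\n", "\n"]
songs = parse_song_hash(sample)
```

With the real data, the same function could be called on an open file handle, for example `parse_song_hash(open("song_hash.txt", encoding="utf-8"))`, and the resulting dictionary used to label the songs in each cluster with readable titles.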