208 Music Emotion Recognition
represents a good example showing how the dimensional conceptualization of
emotion and the techniques developed for dimensional MER can be applied to
music retrieval and management in the 2D emotion space.
13.2 2D Visualization of Music
Visualizing music in 2D space has been studied in the past. Conventional approaches
that manage music pieces by catalog metadata such as artist name, album name, and
song name cannot provide enough information on music similarity, which users highly
expect from a music search tool. Systems that help users retrieve and browse music
pieces in a content-based fashion have therefore been developed. For example, one
approach visualizes a music collection based on two perceptual attributes, rhythm
and timbre. Other systems that present music pieces in 2D space include the
PocketSOM and the Islands of Music, which use the self-organizing map to map
high-dimensional music features to a 2D map grid while preserving similarity
relationships in the feature space. However, no semantic meaning is associated
with the resulting two dimensions.
As some sort of emotional experience is probably the main reason behind most
people’s engagement with music, a plausible way of visualizing music pieces is to
present them in the emotion plane. With Mr. Emo, one can easily retrieve music
pieces of a certain emotion without knowing the titles, or browse a personal
collection in the emotion plane. One can also couple emotion-based retrieval with
traditional keyword- or artist-based retrieval, to retrieve songs similar (in the
sense of perceived emotion) to a favorite piece or to select the songs of an artist
according to emotion. It is also possible to play back music that matches a user’s
current emotional state, which can be estimated from facial or prosodic cues
[22,192,253]. Such a simple 2D user interface for content-based retrieval of music
is particularly useful for small mobile devices such as MP3 players or cell phones,
which have limited display space and input capability.
13.3 Retrieval Methods
In Mr. Emo, the critical task of predicting the valence and arousal (VA) values is
accomplished by the regression techniques introduced in Chapter 4. Given the
regression models, the VA values of an input song can be computed automatically
without manual labeling. Each music piece is then visualized as a point in the
emotion plane at its VA values, and the similarity between music pieces is measured
by Euclidean distance. Many novel retrieval methods can be realized in the emotion
plane, making music information access much easier and more effective. Below we
describe four example music retrieval and organization methods that are performed
in the emotion plane.
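
To make the distance-based similarity concrete, the following is a minimal sketch
of retrieval in the emotion plane, not the Mr. Emo implementation. It assumes that
the VA values have already been predicted by the regression models and are scaled
to [-1, 1]. The Song record, the function names, and the toy library are all
hypothetical and serve only as illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Song:
    """Hypothetical record: a piece with precomputed VA values."""
    title: str
    artist: str
    valence: float  # assumed to lie in [-1, 1]
    arousal: float  # assumed to lie in [-1, 1]


def emotion_distance(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Euclidean distance between two points in the emotion plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def retrieve(songs: List[Song], query_va: Tuple[float, float],
             k: int = 3, artist: Optional[str] = None) -> List[Song]:
    """Return the k songs closest to query_va, optionally limited to one artist."""
    candidates = [s for s in songs if artist is None or s.artist == artist]
    ranked = sorted(
        candidates,
        key=lambda s: emotion_distance((s.valence, s.arousal), query_va),
    )
    return ranked[:k]


def similar_to(songs: List[Song], seed: Song, k: int = 3) -> List[Song]:
    """Return the k songs closest in perceived emotion to a favorite piece."""
    others = [s for s in songs if s != seed]
    return retrieve(others, (seed.valence, seed.arousal), k)


# Toy library with made-up titles, artists, and VA values.
LIBRARY = [
    Song("Track A", "Artist X", 0.8, 0.7),
    Song("Track B", "Artist X", -0.6, -0.5),
    Song("Track C", "Artist Y", 0.7, 0.6),
    Song("Track D", "Artist Y", -0.7, 0.8),
    Song("Track E", "Artist Z", 0.5, -0.6),
]

if __name__ == "__main__":
    # Query by a point in the emotion plane (high valence, high arousal).
    print([s.title for s in retrieve(LIBRARY, (0.9, 0.8), k=2)])
    # Couple emotion-based retrieval with an artist filter.
    print([s.title for s in retrieve(LIBRARY, (0.9, 0.8), k=1, artist="Artist Y")])
    # Query by example: songs similar in emotion to a favorite piece.
    print([s.title for s in similar_to(LIBRARY, LIBRARY[0], k=1)])
```

A linear scan with a sort is adequate for a personal collection. For a much larger
library, a spatial index over the two VA coordinates would be a natural
replacement, but that is beyond this sketch.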