
Music Emotion Recognition by Homer H. Chen, Yi-Hsuan Yang


Saunder January 24, 2011 10:39 book
4 Dimensional MER by Regression
This chapter introduces the regression approach to MER, which is one of the earliest
and most widely adopted techniques for dimensional MER. The regression approach
is also applicable to music emotion variation detection (MEVD) and categorical
MER. In fact, it is the root of many other computational models for dimensional
MER, including those described in the subsequent chapters. Therefore, a thorough
understanding of the regression approach is a prerequisite to the study of advanced
dimensional MER systems.
4.1 Adopting the Dimensional Conceptualization of Emotion
One may categorize emotions into a number of classes and train a classifier (using a
standard pattern recognition procedure) to learn the relationship between music and
emotion. Perhaps the simplest classification of emotion adopts the basic emotions
(e.g., happiness, anger, sadness, and fear) as the emotion classes [90,204].
Alternatively, the emotion classes can be defined in terms of valence (how
positive or negative) and arousal (how exciting or calming) [217, 336, 342, 352,
365]. For example, in the classification shown in Figure 4.1, the emotion classes are
the four quadrants of the emotion plane [272,310].
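The quadrant-based classification can be sketched in a few lines. This is only an illustration: it assumes both axes are centered at zero and assigns points lying on an axis to the non-negative side, neither of which is specified in the text. The function name `quadrant` and the table `QUADRANT_TERMS` are hypothetical; the example terms are the ones shown in Figure 4.1.

```python
def quadrant(valence: float, arousal: float) -> int:
    """Return the quadrant (1 to 4) of a point in the valence-arousal plane.

    Assumption: both axes are centered at 0, and a point lying exactly on
    an axis is assigned to the non-negative side.
    """
    if arousal >= 0:
        return 1 if valence >= 0 else 2
    return 4 if valence >= 0 else 3


# Example emotion terms per quadrant, as labeled in Figure 4.1.
QUADRANT_TERMS = {
    1: ("pleasing", "happy", "exciting"),   # positive valence, high arousal
    2: ("annoying", "angry", "nervous"),    # negative valence, high arousal
    3: ("sad", "boring", "sleepy"),         # negative valence, low arousal
    4: ("calm", "peaceful", "relaxing"),    # positive valence, low arousal
}

print(quadrant(0.6, 0.7), QUADRANT_TERMS[quadrant(0.6, 0.7)])
```

Note that every point in a quadrant receives the same label, however far apart the points are.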
[Figure 4.1: the emotion plane. Horizontal axis: valence (negative to positive);
vertical axis: arousal (low to high). Quadrant 1 (positive valence, high arousal):
pleasing, happy, exciting. Quadrant 2 (negative valence, high arousal): annoying,
angry, nervous. Quadrant 3 (negative valence, low arousal): sad, boring, sleepy.
Quadrant 4 (positive valence, low arousal): calm, peaceful, relaxing.]
Figure 4.1 The 2D valence-arousal emotion plane. (Data from J. A. Russell. J.
Personality & Social Psychology. 39(6): 1161–1178. 1980 and R. E. Thayer. The
Biopsychology of Mood and Arousal, Oxford University Press, New York, 1989)
However, even with the emotion plane as a convenient way to visualize the
emotion classification, the categorical taxonomy of emotion classes is still inherently
ambiguous. Each emotion class represents an area in the emotion plane, and the
emotion states within each area may vary considerably. For example, the first
quadrant of the emotion plane contains emotions such as exciting, happy, and
pleasing, which are different in nature. More importantly, as discussed in
Section 1.3.1, the categorical approach faces a granularity issue: the number of
emotion classes is too small in comparison with the richness of emotion perceived
by humans. Using a finer granularity for emotion description does not necessarily
address the issue, since language is ambiguous and the description of the same
emotion varies from person to person [159].
Unlike the categorical approach, the regression approach to MER developed in [364]
and [163] adopts the dimensional conceptualization of emotion (cf. Section 2.1.2)
and views the emotion plane as a continuous space. Each point of the plane is
considered an emotion state. In this way, the ambiguity associated with the emotion
classes or the affective terms is avoided since no categorical class is needed. The
regression approach is also free of the granularity issue, since the emotion plane
implicitly offers an infinite number of emotion descriptions.
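To make the contrast with classification concrete, the sketch below treats valence and arousal as continuous targets and fits one regressor per dimension. It is a minimal illustration under loud assumptions: a plain least-squares line on a single made-up feature, with invented annotations in [-1, 1]. It is not the regressor or the feature set used in [364] or [163], and the names `fit_line`, `predict_va`, and `feature` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up training data (assumption): one hypothetical feature value per
# music piece, plus annotated valence and arousal values in [-1, 1].
feature = [0.1, 0.3, 0.5, 0.7, 0.9]
valence = [-0.5, -0.1, 0.1, 0.2, 0.6]
arousal = [-0.8, -0.4, 0.0, 0.4, 0.8]

# One independent regressor per emotion dimension.
valence_model = fit_line(feature, valence)
arousal_model = fit_line(feature, arousal)


def predict_va(x):
    """Predict a (valence, arousal) point for a piece with feature value x."""
    v = valence_model[0] * x + valence_model[1]
    a = arousal_model[0] * x + arousal_model[1]
    return v, a


print(predict_va(0.6))
```

The output is a point in the plane rather than a class label, so two pieces that would fall in the same quadrant can still be told apart.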
The regression approach applies a computational model that predicts the valence
and arousal (VA) values of a music piece, which determine the placement of the
music piece in the emotion plane. This placement directly indicates the affective
content of the music piece. A user can then retrieve music by specifying a point
in the emotion plane according to his or her emotion state, and the system returns
the music pieces whose locations are closest to the specified point. Because the
2D emotion plane provides a simple basis for a user interface, novel emotion-based
music organization, browsing, and retrieval can be easily created for mobile
devices. Such a user interface is of great use in managing large-scale
music databases. Chapter 13 has more details about this aspect of the approach.
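The retrieval scenario described above amounts to a nearest-neighbor search in the emotion plane. The sketch below assumes Euclidean distance as the measure of closeness (the text says only "closest") and uses a made-up library of track names and VA coordinates. The function `retrieve` and its parameter `k` are hypothetical.

```python
import math


def retrieve(library, query, k=3):
    """Return the k titles whose (valence, arousal) points lie closest to query.

    Assumption: closeness is measured by Euclidean distance in the plane.
    """
    ranked = sorted(library, key=lambda title: math.dist(library[title], query))
    return ranked[:k]


# Made-up library (assumption): title -> predicted (valence, arousal).
library = {
    "track_a": (0.8, 0.7),
    "track_b": (-0.6, 0.5),
    "track_c": (-0.7, -0.6),
    "track_d": (0.6, -0.5),
    "track_e": (0.5, 0.9),
}

# The user specifies a point in the first quadrant (positive valence, high arousal).
print(retrieve(library, (0.7, 0.8), k=2))
```

A linear scan like this is adequate for a small collection; a large-scale database would likely need a spatial index, which is outside the scope of this sketch.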
Clearly, the viability of the regression approach to MER heavily relies on the
accuracy of predicting the valence and arousal values, or VA prediction. As its name
