Shan et al.'s [39] work on emotion recognition is in line with Efron's analysis.
They focused on spatiotemporal aspects of body gestures for modeling and
recognizing emotional states. Instead of defining specific spatiotemporal features
as Efron did, they analyzed video sequences directly, without encoding further
domain knowledge into the definition of specific features. They used spatial and
temporal filters to identify regions and time series that showed strong spatial or
temporal activity.
Shan et al.'s work was based on the general assumption that, although there is
strong variance in how a gesture is performed, the spatiotemporal features related
to emotions are stable across subjects. Features were calculated directly on the
video image as points of interest in space-time, derived by applying spatial
(Gaussian) and temporal (Gabor) filters to the video image. To classify emotions,
a clustering approach was used to identify movement prototypes based on these
interest points. Recognition rates using support vector machines ranged between
59% and 83% for a seven-class problem (anger, anxiety, boredom, disgust, joy,
puzzlement, surprise). To train their recognition system they used a database
containing around 1900 videos. Additionally, they showed that fusing information
from gestural activity and facial expressions can result in higher recognition rates.
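A detector of the kind described above, combining spatial Gaussian smoothing with a quadrature pair of temporal Gabor filters, can be sketched as follows. This is a minimal illustration, not Shan et al.'s implementation; the function name and all parameter values are assumptions chosen for readability.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, convolve1d

def spacetime_response(video, sigma=2.0, tau=1.5, omega=0.25):
    """Space-time interest-point response map for a grayscale clip.

    video: array of shape (T, H, W).
    Each frame is smoothed with a spatial Gaussian; a quadrature pair
    of 1-D temporal Gabor filters is then applied along the time axis.
    High values in the returned map mark regions with strong combined
    spatial and temporal activity.
    """
    # Spatial Gaussian smoothing only (sigma 0 along the time axis).
    smoothed = gaussian_filter(video.astype(float), sigma=(0.0, sigma, sigma))

    # Quadrature pair of temporal Gabor filters (even and odd phase).
    t = np.arange(-int(np.ceil(3 * tau)), int(np.ceil(3 * tau)) + 1)
    envelope = np.exp(-(t ** 2) / (tau ** 2))
    h_even = np.cos(2 * np.pi * omega * t) * envelope
    h_odd = np.sin(2 * np.pi * omega * t) * envelope

    # Convolve along the temporal axis and combine the quadrature pair,
    # giving a phase-insensitive measure of temporal energy.
    r_even = convolve1d(smoothed, h_even, axis=0, mode="nearest")
    r_odd = convolve1d(smoothed, h_odd, axis=0, mode="nearest")
    return r_even ** 2 + r_odd ** 2
```

Local maxima of this response map would then serve as the interest points that the clustering step groups into movement prototypes.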
To sum up, a number of studies demonstrated the correlation between the qualita-
tive features of gestural activity, as described in Section 13.2, and emotional states.
However, they also showed that this correlation is not unambiguous and sometimes
allows us to derive only the intensity of an emotion or its valence, not the distinct
emotion itself. Some first approaches to automatically recognizing emotions based
on such correlations were presented; these are very promising but, at the moment,
lack comparability because of the different sets of emotions and the quite different
databases employed for training and testing the recognition techniques.
Whereas the analysis of emotional states has become very popular in recent years,
other contextual factors influencing interactions, such as personality or cultural
heuristics for behavior, have not been a central focus of this research, although, for
instance, Gallaher's expressive parameters were defined to capture the relation
between body movements and personality.
Ball and Breese [1] presented a first model for integrating personality as a factor
influencing gestural behavior. To this end, they defined a Bayesian network to model
the causal relations between gestural activity and posture on the one hand and
personality traits on the other. Their model was based on studies showing that
people are able to reliably interpret personality traits from movement features. The
approach was primarily concerned with conveying the personality of an embodied
agent through characteristic movements. However, because they modeled this
relation with a Bayesian network, the same approach can be employed to recognize
the user's personality based on his movement characteristics, which were already
modeled in the network. Apart from
defining specific postures and gestures that are most likely to occur in correlation
336 CHAPTER 13 Nonsymbolic Gestural Interaction for Ambient Intelligence
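The key property exploited here, that a Bayesian network modeling how traits generate movements can be inverted to infer traits from observed movements, can be illustrated with a toy two-node network. The trait and posture labels and all probabilities below are hypothetical, not taken from Ball and Breese's model.

```python
# Hypothetical generative model: a binary personality trait produces
# an observable posture feature.
prior = {"dominant": 0.5, "submissive": 0.5}            # P(trait)
likelihood = {                                           # P(posture | trait)
    "dominant":   {"expansive": 0.8, "contained": 0.2},
    "submissive": {"expansive": 0.3, "contained": 0.7},
}

def infer_trait(observed_posture):
    """Invert the generative model with Bayes' rule: P(trait | posture)."""
    # Joint probability P(trait, posture) for each trait value.
    joint = {t: prior[t] * likelihood[t][observed_posture] for t in prior}
    # Normalize by the evidence P(posture) to obtain the posterior.
    evidence = sum(joint.values())
    return {t: p / evidence for t, p in joint.items()}
```

Running the network "forward" animates an agent with trait-appropriate movements; running `infer_trait` "backward" recovers a posterior over the user's traits from observed movement features, which is exactly the dual use described above.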
