
the most relevant features and train a classifier with the resulting
feature set. Hence, fusion integrates low-level features at the
feature level (see Figure 1b) and takes place at a rather early stage
of the recognition process.
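
As a rough illustration of feature-level fusion, the following minimal sketch concatenates per-sample feature vectors from two modalities, keeps the features that best separate the classes, and trains a single classifier on the reduced set. Everything specific here is an assumption made for illustration and is not taken from the text: the toy audio and video values, the emotion labels, the selection criterion (spread between class means), and the nearest-centroid classifier.

```python
from statistics import mean

def fuse_features(audio, video):
    """Concatenate the per-sample feature vectors of two modalities."""
    return [a + v for a, v in zip(audio, video)]

def select_features(X, y, k):
    """Return the indices of the k features whose class means differ most.

    This criterion is a hypothetical stand-in for a real feature
    selection method.
    """
    labels = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        class_means = [mean(x[j] for x, lab in zip(X, y) if lab == c)
                       for c in labels]
        scores.append(max(class_means) - min(class_means))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def train_centroids(X, y, keep):
    """Compute one centroid per class over the selected features."""
    centroids = {}
    for c in set(y):
        rows = [[x[j] for j in keep] for x, lab in zip(X, y) if lab == c]
        centroids[c] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x, keep):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    z = [x[j] for j in keep]
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(z, centroids[c])))

# Toy data (invented): two audio and two video features per sample.
audio = [[0.9, 0.5], [0.8, 0.4], [0.1, 0.5], [0.2, 0.6]]
video = [[0.5, 0.1], [0.5, 0.2], [0.4, 0.9], [0.5, 0.8]]
y = ["joy", "joy", "anger", "anger"]

X = fuse_features(audio, video)        # four fused features per sample
keep = select_features(X, y, k=2)      # indices of the retained features
centroids = train_centroids(X, y, keep)
print(keep, predict(centroids, [0.85, 0.5, 0.5, 0.1], keep))
```

Because selection runs on the concatenated vector, the retained features may come from either modality; in this toy data one audio feature and one video feature are kept.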
An alternative is to fuse the recognition results at the decision
level, based on the outputs of separate unimodal classifiers (see
Figure 1c). Here, a separate unimodal classifier is trained for each
modality, and the resulting decisions are fused using specific
weighting rules. In the case of emotion recognition, the
input for the fusion algorithm may consist of either discrete ...