In this chapter, we shall extend the exploration of verbal content generation for agents presented in the previous chapter, and consider how the prosody of an utterance is generated as a function of the dialog acts and social attitudes being expressed. This returns to our fourth research question (defined in I.3.3), which here concerns how to model sequences of prosodic events in a social attitude generation system for an embodied conversational agent. Later in the chapter, we shall extend this question to a multimodal context, proposing an innovative model for generating co-verbal gestures based on verbal and prosodic content (section 5.3).
As in the previous chapters, the computational models presented here build on the literature in conversation analysis and psycholinguistics, and on corpus analysis. For the co-verbal gesture generation model presented in section 5.3, only qualitative analysis was used. To generate the socio-emotional behaviors presented in section 5.1, we tested sequence mining methods as a means of reducing the effort needed to move from corpus analysis to a computational model.
5.1. Generating agent prosody
In the previous chapter, we presented an agent appreciation generation module based on verbal content generation patterns. The non-verbal content of agent utterances was based on parameters defined in Catherine Pelachaud’s Greta platform [OCH 13], ...