success. Following this, an important performance factor is the ability of the dialogue manager to cope with errors. This can be measured by the number of repaired errors and the efficiency with which repair takes place (e.g., the number of correction turns per error). On a less pragmatic level, initiative can be assessed by the balance between contributions (e.g., dialogue acts) from each participant. The dialogue manager's ability to correct misunderstandings can, for example, be quantified by counting occurrences of corrected misunderstandings and of meta-communication; several metrics for this purpose are listed in [24].
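As a minimal sketch, the repair and initiative measures above could be computed from an annotated dialogue log. The annotation scheme assumed here ("error", "correction", and "repaired" tags plus speaker labels) is hypothetical; real logging formats vary by system and are not prescribed by the text.

```python
# Sketch: repair and initiative metrics over an annotated dialogue log.
# The tag names and speaker labels are an assumed annotation scheme.

def repair_and_initiative_metrics(turns):
    """turns: list of dicts with a 'speaker' key ('user' or 'system')
    and a 'tags' key (a set of labels such as 'error' or 'correction')."""
    errors = sum(1 for t in turns if "error" in t["tags"])
    corrections = sum(1 for t in turns if "correction" in t["tags"])
    repaired = sum(1 for t in turns if "repaired" in t["tags"])
    user_turns = sum(1 for t in turns if t["speaker"] == "user")
    return {
        # share of errors that were eventually repaired
        "repair_rate": repaired / errors if errors else 1.0,
        # repair efficiency: correction turns spent per error
        "correction_turns_per_error": corrections / errors if errors else 0.0,
        # initiative balance: share of turns contributed by the user
        "user_turn_share": user_turns / len(turns) if turns else 0.0,
    }
```

A perfectly balanced dialogue would yield a user turn share near 0.5; lower values suggest a system-driven interaction.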
Contextual appropriateness. This can be related to Grice's cooperative principle [5, 18] and quantified in terms of violations of that principle, for example via the contextual appropriateness parameter [16].
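One simple way to turn such annotations into a score is the share of system utterances judged appropriate. The three-way labels used below (appropriate / ambiguous / inappropriate) are an assumed annotation scheme for illustration; the parameter defined in [16] uses its own categories.

```python
# Sketch: contextual appropriateness as the fraction of system
# utterances annotated as appropriate. Labels are an assumed scheme.
from collections import Counter

def contextual_appropriateness(labels):
    counts = Counter(labels)
    total = sum(counts.values())
    return counts["appropriate"] / total if total else 0.0
```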
Output modality appropriateness. As on the input side, output modality appropriateness can be checked on the basis of modality properties [4], taking into account the interrelations between simultaneous modalities. For example, textual information may be presented either visually or auditorily, but not in both modalities simultaneously, so as not to confuse the user [41].
Form appropriateness. This refers to the surface form of the output provided to the user. For example, form appropriateness of spoken output can be measured via its intelligibility, comprehensibility, or required listening effort. The appropriateness of an embodied conversational agent can be assessed by its ability to convey specific information, including emotions, turn taking, backchannels, and so forth.
On the user's side, interaction performance can be quantified by the effort required of the user to interact with the system, as well as by the freedom of interaction. Aspects include
Perceptual effort. The effort required to decode system messages and to understand and interpret their meaning [51] (e.g., listening or reading effort). Metrics: the Borg scale [8].
Cognitive workload. The costs of task performance (e.g., necessary information
processing capacity and resources) [47]. An overview of subjective and objec-
tive methods for assessing cognitive workload is given in [50].
Response effort. The physical effort required to communicate with the system, for example, the effort necessary to enter information into a mobile phone. Metrics: questionnaires and the like [26].
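To make the cognitive workload item concrete, the raw (unweighted) NASA-TLX score is one widely used subjective workload metric: the mean of six subscale ratings. It is offered here as an illustration and is not a metric taken from [50].

```python
# Sketch: raw (unweighted) NASA-TLX workload score. The six subscale
# names and the 0-100 rating range follow the standard instrument;
# the original NASA-TLX additionally weights the subscales.
TLX_SUBSCALES = ("mental", "physical", "temporal",
                 "performance", "effort", "frustration")

def raw_tlx(ratings):
    """ratings: dict mapping each subscale name to a 0-100 rating."""
    missing = [s for s in TLX_SUBSCALES if s not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return sum(ratings[s] for s in TLX_SUBSCALES) / len(TLX_SUBSCALES)
```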
So far, we have limited ourselves to influencing factors and performance metrics.
However, the ultimate aim of a system developer should be to satisfy the user, or
at least to provide acceptable services. According to Hassenzahl and colleagues [20], the user's evaluation of a system is influenced by pragmatic and hedonic quality
aspects. These have to be evaluated with the help of real or test users providing
judgments on what they perceive. Such judgments can be seen as “direct” quality
14.5 Quality Aspects 353