Design issues for mobile systems 85
As you may expect, speaker-dependent systems have a greater recog-
nition accuracy than speaker-independent systems. Noyes (2001) high-
lights the fact that current speaker-independent recognition systems
have problems handling large vocabularies and multiple word input.
This finding is supported by the fact that many speech-based automated
phone services use a numbered menu style. This constrains the options
available to the user (e.g. ‘for cinema listings, say 1, for ticket
prices, say 2’), resulting in high recognition rates but longer
interaction times, which will have an impact on the user’s mobile
phone bill.
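To make the trade-off concrete, here is a minimal sketch of the numbered menu style in Python. It is an illustration only: the `MENU` table, `build_prompt` and `interpret` are hypothetical names, and the speech recogniser is assumed to have already turned the caller's utterance into a text string.

```python
# Hypothetical menu, using the two options from the example in the text.
MENU = {"1": "cinema listings", "2": "ticket prices"}


def build_prompt(menu):
    """Build the spoken prompt: every option is read out, one after another."""
    return ", ".join(f"for {label}, say {key}" for key, label in menu.items())


def interpret(recognised_word, menu):
    """Accept only words in the small fixed vocabulary; reject everything else."""
    return menu.get(recognised_word.strip())


print(build_prompt(MENU))                # for cinema listings, say 1, for ticket prices, say 2
print(interpret("2", MENU))              # ticket prices
print(interpret("show me films", MENU))  # None
```

The vocabulary is only as large as the number of menu keys, which is what keeps recognition reliable. The cost is that the prompt grows with every option added, so the caller spends longer listening.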
Another important area to consider here is the set of menu issues that
arise with automated mobile phone services. Let’s consider these in a bit
more detail.
Menu issues for automated mobile phone services
Automated mobile phone services usually have a structured menu hierarchy
but do not always provide the user with navigational tips. For example,
a service may read out a long list of menu choices that overloads the
caller’s short-term memory (if you are not sure about this term, refer
back to Chapter 2 on individual differences). As a result, callers become
lost, confused and irritated with the system.
Why is this a problem? Because of the difference between speech inter-
faces and graphical user interfaces. A graphical user interface (GUI) can
display information via graphics, text, icons, menus, video and audio.
These different modes can be used as short-term memory aids and navi-
gational cues; for example, the user can scan menu lists until they find the
option they require. However, with a speech-based interface, there are no
short-term memory aids and information can only be presented serially.
Therefore speech systems have a dual function: providing information
and navigation cues. This, according to Brewster (1997), is the main
cause of navigational problems associated with speech-based phone
services.
So how can these short-term memory overload problems be addressed?
Three approaches that could help reduce this burden on the user are: con-
versational dialogues, earcons and metaphors. Let’s have a look at each of
these in turn.
H6352-Ch05.qxd 7/18/05 12:40 PM Page 85
Conversational interfaces
The best way to define a conversational style dialogue is to give you a
description of one of the earliest examples of this kind of interface style.
Schmandt (1987) described an automated telephone service called the
Phone Slave which was an answering machine service that allowed users
to retrieve stored messages. The system worked on the basis of asking the
caller a series of questions like ‘who’s calling please?’ and ‘what’s this
in reference to?’. The Phone Slave had no understanding of the content of
the messages left by any of the callers. The users’ responses were stored
digitally by the service and could be accessed in sequential order by
the system’s owner. For example, the owner could ask the service ‘who
left messages?’ and the system would respond by playing back all the
responses to its own query ‘who’s calling please?’.
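The storage scheme described above can be sketched in a few lines of Python. This is not Schmandt's implementation. The `MessageStore` class and its method names are hypothetical, and plain strings stand in for the digitally recorded audio. The point it illustrates is that answers are filed under the question that elicited them and never interpreted.

```python
from collections import defaultdict

# The two prompts quoted in the text; the real service may have asked more.
PROMPTS = ["who's calling please?", "what's this in reference to?"]


class MessageStore:
    """Files each caller's answers under the prompt that elicited them."""

    def __init__(self):
        self._answers = defaultdict(list)  # prompt -> answers in call order

    def take_message(self, answers):
        """Store one caller's answers, paired with the prompts in order."""
        for prompt, answer in zip(PROMPTS, answers):
            self._answers[prompt].append(answer)

    def play_back(self, prompt):
        """Return every stored answer to one prompt, in sequential order."""
        return list(self._answers[prompt])


store = MessageStore()
store.take_message(["Ann", "lunch on Friday"])
store.take_message(["Bob", "the quarterly report"])

# The owner's query 'who left messages?' maps onto the first prompt.
print(store.play_back(PROMPTS[0]))  # ['Ann', 'Bob']
```

Because the answers are indexed by prompt, the owner's query can be served without the system knowing what any caller actually said.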
Schmandt found that the interface was very effective in eliciting appro-
priate voice message components from callers, attributing the success of
the conversational style to the apparent high quality of the spoken prompts
provided by the system. Taking a message requires co-operative behaviour,
and there is no reason to think that callers will not follow conventional
rules. By asking a series of questions, as opposed to a message such as
‘leave your message after the beep’, the system makes it easier for the user
to leave a more complete message as a series of components. In doing
so, the system maintains its ability to control the conversation and protects
the system’s limited ‘intelligence’ from being exposed.
Other examples of this type of approach can be seen with the Philips
TABA train timetable information service (Souvignier et al., 2000) and
the SpeechWorks air travel reservation system (Barnard et al., 1999).
Both of these systems rely on system-initiated dialogue to keep speech
recognition accuracy levels high.
However, there are still some problems to be overcome for natural
language conversational interfaces, such as increasing the number of
words that can successfully be recognised as a part of the service’s voca-
bulary, and effective recovery methods when misrecognition occurs. It’s
important that the service gets the user back on track and that the user
has confidence in the system’s ability to understand what they are saying.
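One possible recovery strategy, sketched here in Python, is to re-prompt with the valid options spelled out and to give up after a fixed number of attempts. This strategy is an assumption chosen for illustration and is not taken from the systems cited above. The function `ask_with_recovery`, its parameters and the `recognise` callback are all hypothetical.

```python
def ask_with_recovery(prompt, vocabulary, recognise, max_attempts=3):
    """Ask a question, re-prompting with the valid options after each miss.

    `recognise` is a stand-in for a speech recogniser: it takes the prompt
    text and returns the recognised reply as a string.
    """
    for _ in range(max_attempts):
        reply = recognise(prompt)
        if reply in vocabulary:
            return reply
        # Get the caller back on track by listing what the service accepts.
        options = ", ".join(sorted(vocabulary))
        prompt = f"Sorry, I didn't catch that. Please say one of: {options}"
    return None  # the caller could now be passed to an operator or a keypad menu


# Simulated caller: the first reply is not in the vocabulary, the second is.
replies = iter(["um", "prices"])
result = ask_with_recovery(
    "What would you like?", {"listings", "prices"}, lambda _prompt: next(replies)
)
print(result)  # prices
```

Capping the number of attempts matters as much as the re-prompt itself, since a caller stuck in an endless loop is unlikely to keep confidence in the service.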
Earcons
Blattner et al. (1989) define earcons as being abstract musical tones that
can be combined to produce sound messages to represent parts of
86 Understanding Mobile Human–Computer Interaction