Human Factors in Speech

For many years, a group of human factors specialists has studied the implications of speech technology for human-computer interaction. Since the 1940s, a body of knowledge has accumulated on how human capabilities intersect with speech technology. For example, what can the human ear hear? How well do people recognize speech? What characteristics of speech make it hard to understand?

In addition to the physiological aspects of human factors, there are the cognitive and psychological aspects of humans interacting with speech technology in computers. For example, what constraints must users observe in their speech so that a speech recognizer can understand them? Does constraining their speech make them more or less effective? Does it change the way they work? How do people react to synthesized speech? Do they mind if the computer sounds like a computer? Do they prefer that it sound like a human? How does computer speech affect task performance?

Now add to this the aspect of multi-modality. Some speech technology involves speech only, but a significant portion of the interfaces being designed with speech are multi-modal: they involve not just speech but other modes, such as the tactile or the visual. For example, a desktop dictation system involves speaking to the computer and possibly using a mouse and keyboard to make corrections. Speech added to a personal digital assistant (PDA) handheld device means that people will be speaking while looking at ...
