Chapter 9: Multimodal Input for Perceptual User Interfaces

Joseph J. LaViola Jr., Sarah Buchanan and Corey Pittman

University of Central Florida, Orlando, Florida

9.1 Introduction

Ever since Bolt's seminal paper, "'Put-That-There': Voice and Gesture at the Graphics Interface", the notion that multiple modes of input could be used to interact with computer applications has been an active area of human-computer interaction research [1]. This combination of different forms of input (e.g., speech, gesture, touch, eye gaze) is known as multimodal interaction, and its goal is to support natural user experiences by giving the user a choice of ways to interact with a computer. These choices can help simplify the interface, provide more robust input when recognition technology is used, and support more realistic interaction scenarios, because the interface can be more finely tuned to the human communication system. More formally, multimodal interfaces process two or more input modes in a coordinated manner, aiming to recognize natural forms of human language and behavior, and they typically incorporate more than one recognition-based technology [2].
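To make the idea of processing input modes "in a coordinated manner" concrete, the sketch below (ours, not taken from the chapter or from Bolt's system) shows one simple late-fusion strategy for a "put that there"-style command: deictic words from a speech recognizer are resolved against the pointing gesture closest in time. All class and function names are hypothetical, and the time-window fusion rule is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SpeechToken:
    word: str
    time: float          # seconds; midpoint of the spoken word

@dataclass
class PointingEvent:
    target: str          # e.g., an object id or location label
    time: float          # seconds; moment of the pointing gesture

DEICTIC_WORDS = {"this", "that", "here", "there"}

def nearest_point(events: List[PointingEvent], t: float,
                  window: float = 1.0) -> Optional[PointingEvent]:
    """Return the pointing event closest in time to t, within a window (hypothetical rule)."""
    candidates = [e for e in events if abs(e.time - t) <= window]
    return min(candidates, key=lambda e: abs(e.time - t), default=None)

def fuse(tokens: List[SpeechToken],
         points: List[PointingEvent]) -> List[str]:
    """Replace each deictic word with the target of the temporally nearest pointing event."""
    resolved = []
    for tok in tokens:
        if tok.word in DEICTIC_WORDS:
            p = nearest_point(points, tok.time)
            resolved.append(p.target if p else tok.word)
        else:
            resolved.append(tok.word)
    return resolved

if __name__ == "__main__":
    speech = [SpeechToken("put", 0.2), SpeechToken("that", 0.5),
              SpeechToken("there", 1.4)]
    gestures = [PointingEvent("blue_square", 0.6),
                PointingEvent("upper_left_corner", 1.5)]
    print(" ".join(fuse(speech, gestures)))
    # -> put blue_square upper_left_corner
```

Real multimodal systems use richer fusion (probabilistic alignment, unification over semantic frames), but even this toy example illustrates why coordination across recognizers, here via timestamps, is the defining feature of a multimodal interface.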

With the advent of more powerful perceptual computing technologies, multimodal interfaces that can passively sense what the user is doing are becoming more prominent. These interfaces, also called perceptual user interfaces [3], provide mechanisms that support unobtrusive interaction where sensors are placed in the physical ...
