
Designing Voice User Interfaces

Name: Designing Voice User Interfaces
Author: Cathy Pearl
ISBN: 9781491955369

by Cathy Pearl

December 2016

Intermediate to advanced

278 pages

6h 21m

English

O'Reilly Media, Inc.


Dedication
Praise for Designing Voice User Interfaces
Preface
Why Write This Book?; The Chinese Room and the Turing Test; Who Should Read This Book; How This Book Is Organized; O’Reilly Safari; How to Contact Us; Acknowledgments
1. Introduction
A Brief History of VUIs; The Second Era of VUIs; Why Voice User Interfaces?; Conversational User Interfaces; An Interview with Alexa; What Is a VUI Designer?; Chatbots; Conclusion
2. Basic Voice User Interface Design Principles
Designing for Mobile Devices Versus IVR Systems; Conversational Design; Setting User Expectations; Design Tools; Sample Dialogs; Visual Mock-Ups; Flow; Prototyping Tools; Confirmations; Method 1: Three-Tiered Confidence; Method 2: Implicit Confirmation; Method 3: Nonspeech Confirmation; Method 4: Generic Confirmation; Method 5: Visual Confirmation; Command-and-Control Versus Conversational; Command-and-Control; Conversational; Conversational Markers; Error Handling; No Speech Detected; Speech Detected but Nothing Recognized; Recognized but Not Handled; Recognized but Incorrectly; Escalating Error; Don’t Blame the User; Novice and Expert Users; Keeping Track of Context; Help and Other Universals; Latency; Disambiguation; Design Documentation; Prompts; Grammars/Key Phrases; Accessibility; Interaction Should Be Time-Efficient; Keep It Short; Talk Faster!; Interrupt Me at Any Time; Provide Context; Where Am I?; Text-to-Speech Personalization; Conclusion
3. Personas, Avatars, Actors, and Video Games
Personas; Should My VUI Be Seen?; Using an Avatar: What Not to Do; Using an Avatar (or Recorded Video): What to Do; Storytelling; Teamwork; Video Games; When Should I Use Video in My VUI?; Visual VUI—Best Practices; Should My Users See Themselves?; What About the GUI?; Handling Errors; Turn Taking and Barge-In; Maintaining Engagement and the Illusion of Awareness; Visual (Non-Avatar) Feedback; Choosing a Voice; Pros of an Avatar; The Downsides of an Avatar; The Uncanny Valley; Conclusion
4. Speech Recognition Technology
Choosing an Engine; Barge-In; Timeouts; End-of-speech timeout; No speech timeout; Too much speech; N-Best Lists; The Challenges of Speech Recognition; Noise; Multiple Speakers; Children; Names, Spelling, and Alphanumeric; Data Privacy; Conclusion
5. Advanced Voice User Interface Design
Branching Based on Voice Input; Constrained Responses; Open Speech; Categorization of Input; Wildcards and Logical Expressions; Disambiguation; Not Enough Information; More Than One Piece of Information When Only One Is Expected; Handling Negation; Capturing Intent and Objects; Dialog Management; Don’t Leave Your User Hanging; Should the VUI Display What It Recognized?; Sentiment Analysis and Emotion Detection; Text-to-Speech Versus Recorded Speech; Speaker Verification; “Wake” Words; Context; Advanced Multimodal; Bootstrapping Datasets; Website data; Call center data; Data collection; Advanced NLU; Conclusion
6. User Testing for Voice User Interfaces
Special VUI Considerations; Background Research on Users and Use Cases; Don’t Reinvent the Wheel; Designing a Study with Real Users; Task Definition; Choosing Participants; Questions to Ask; Open responses (to be asked verbally); Things to Look For; Early-Stage Testing; Sample Dialogs; Mock-ups; Wizard of Oz Testing; Difference Between WOz and Usability Testing; Usability Testing; Remote Testing; Moderated versus unmoderated; Video recording; Services for remote testing; Lab Testing; Guerrilla Testing; Performance Measures; Next Steps; Testing VUIs in Cars, Devices, and Robots; Cars; Devices and Robots; Conclusion
7. Your Voice User Interface Is Finished! Now What?
Prerelease Testing; Dialog Traversal Testing; Recognition Testing; Load Testing; Measuring Performance; Task Completion Rates; Dropout Rate; Other Items to Track; Amount of time in the VUI; Barge-in; Speech versus GUI; High no-speech timeouts, no matches; Navigation; Latency; Whole call recording; Logging; Transcription; Release Phases; Pilot; Surveys; Analysis; Confidence Thresholds; End-of-Speech Timeouts; Interim Results versus Final Results; Custom Dictionaries; Prompts; Tools; Regression Testing; Conclusion

8. Voice-Enabled Devices and Cars
Devices; Home Assistants; Watches/Bands/Earbuds; Other Devices; Cars and Autonomous Vehicles; Challenges of Designing VUI for the Car; Designing for in the Car; Distracted Driving; Device Shifting; Interaction Mode; Conclusions on Cars; Conclusion
A. Epilogue
B. Products Mentioned in This Book
Mobile Phone Assistants; Home Assistants; Toys/Other; Apps; Video Games; Watches / Bands; Cars
C. About the Author
Index
About the Author
Colophon
Copyright

Content preview from Designing Voice User Interfaces

Preface

WE LIVE IN A MAGICAL TIME. While lounging on my living room sofa, using only my voice I can order a pound of gummy bears to be delivered to my door within two hours. (Whether or not it’s a good thing that I can do this is a discussion for another book.)

The technology of speech recognition—having a computer understand what you say to it—has grown in leaps and bounds in the past few years. In 1999, when I began my career in voice user interface (VUI) design at Nuance Communications, I was amazed that a computer could understand the difference between me saying “checking” versus “savings.” Today, you can pick up your mobile phone—another magical device—and say, “Show me coffee shops within two miles that have WiFi and are open on Sundays,” and get directions to all of them.

In the 1950s, when computers were beginning to spark people’s imaginations, the spoken word was considered to be a relatively easy problem. “After all,” it was thought, “even a two-year-old can understand language!”

As it turns out, comprehending language is quite complex. It’s filled with subtleties and idiosyncrasies that take humans years to master. Decades were spent trying to program computers to understand the simplest of commands. It was believed by some that only an entity that lived in the physical world could ever truly understand language, because without context it is impossible to understand the meaning behind the words.

Speech recognition was around in science fiction long before it came ...



Publisher Resources

ISBN: 9781491955406
Errata Page

