Chapter 9. The Web Speech API

Introduction

In the age of smart devices and assistants, your voice has become another commonly used input method. Whether you’re dictating a text message or asking for tomorrow’s weather forecast, speech recognition and synthesis are becoming useful tools in app development. With the Web Speech API, you can make your app speak or listen for a user’s voice input.

Speech Recognition

The Web Speech API brings speech recognition to the browser. Once the user gives you permission to use the microphone, it listens for speech. When it recognizes a series of words, it triggers an event with the recognized content.
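The flow described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `startListening` and `onTranscript` are hypothetical names, the `en-US` language is an assumption, and the recognition constructor is passed in as a parameter so the sketch does not depend on any particular browser.

```javascript
// Minimal sketch. `startListening` and `onTranscript` are hypothetical names.
// `Recognition` is the speech recognition constructor supplied by the caller.
function startListening(Recognition, onTranscript) {
  const recognition = new Recognition();
  recognition.lang = 'en-US'; // assumed language; adjust for your users

  // The `result` event fires when the browser has recognized some speech.
  recognition.addEventListener('result', (event) => {
    // `resultIndex` points at the first result that changed in this event.
    const result = event.results[event.resultIndex];
    // Each result holds one or more alternatives; take the first one.
    onTranscript(result[0].transcript);
  });

  // Starts listening. On first use, the browser asks for microphone access.
  recognition.start();
  return recognition;
}
```

In a browser, you might call it as `startListening(window.SpeechRecognition || window.webkitSpeechRecognition, console.log)` to log each recognized phrase.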

Note

Speech recognition is not yet supported by all browsers. See CanIUse for the latest compatibility data.
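Because support varies, an app can check for the recognition constructor before trying to use it. The sketch below is one way to do that. The helper name `getSpeechRecognition` is hypothetical, and the `scope` parameter exists only so the check can be exercised outside a browser. Chrome exposes the constructor under the prefixed name `webkitSpeechRecognition`, so the sketch looks for both names.

```javascript
// Hypothetical helper: returns the speech recognition constructor,
// or null when the current environment does not provide one.
function getSpeechRecognition(scope = globalThis) {
  // Prefer the standard name, then fall back to the prefixed one.
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

const Recognition = getSpeechRecognition();
if (Recognition === null) {
  // Hide or disable voice features here and offer typed input instead.
}
```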

You’ll need the user’s permission before you can start listening for speech. For privacy reasons, the first time you attempt to listen, the browser prompts the user to grant your app permission to use the microphone (see Figure 9-1).

Figure 9-1. A microphone permission request in Chrome

Some browsers, such as Chrome, use an external server for analyzing the captured audio to recognize speech. This means speech recognition won’t work when you’re offline, and it might also raise privacy concerns.
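Both a denied microphone permission and a lost network connection surface through the recognition object's `error` event, whose `error` property holds a short code such as `not-allowed` or `network`. The following is a sketch of one way to turn those codes into messages. The function names and the message wording are assumptions made for illustration.

```javascript
// Hypothetical helper: maps an error code to a user-facing message.
// The message text is illustrative only.
function describeRecognitionError(code) {
  switch (code) {
    case 'not-allowed':
    case 'service-not-allowed':
      return 'Microphone access was denied. Allow it in your browser settings to use voice input.';
    case 'network':
      return 'Speech recognition needs a network connection. Check that you are online.';
    default:
      return `Speech recognition failed (${code}).`;
  }
}

// Hypothetical helper: reports recognition errors through `showMessage`.
function watchForErrors(recognition, showMessage) {
  recognition.addEventListener('error', (event) => {
    showMessage(describeRecognitionError(event.error));
  });
}
```

Passing the message to a callback keeps the sketch independent of how your app displays errors, whether that is an alert, a banner, or a log entry.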
