Google Cloud Platform in Action

by John J. (JJ) Geewax

September 2018

Intermediate to advanced

632 pages

21h 40m

English

Manning Publications

Chapter 16. Cloud Speech: audio-to-text conversion

This chapter covers

An overview of speech recognition
How the Cloud Speech API works
How Cloud Speech pricing is calculated
An example of generating automated captions from audio content

When we talk about speech recognition, we generally mean taking an audio stream (for example, an MP3 file of a book on tape) and turning it into text (in this case, back into the actual written book). This process sounds straightforward, but as you may know, language is a particularly tricky human construct. For instance, the psychological phenomenon called the McGurk effect changes what we hear based on what we see. In one classic example, the sound “ba” can be perceived as “fa” so long as we see someone’s ...