Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

Overview of the DeepSpeech model

The model consists of a stack of fully connected hidden layers, followed by a bidirectional RNN, followed by additional hidden layers at the output. The first three nonrecurrent layers act as a preprocessing step for the RNN layer. One addition is the use of clipped rectified linear units (ReLUs), which cap the activations to prevent them from exploding. The input audio features are Mel-frequency cepstral coefficients (MFCCs), which the nonrecurrent layers see as time slices of the spectrogram. In addition to the current time slice, the spectral data is preprocessed so that each input also includes past and future context frames. The fourth layer is the RNN layer, which has both a forward recurrence and a backward recurrence. The fifth layer takes the concatenated outputs of ...
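Two of the ideas in this overview, the clipped ReLU and the past/future context added to each time slice, can be illustrated with a minimal plain-Python sketch. This is not the book's or DeepSpeech's implementation. The names `clipped_relu` and `add_context` are hypothetical, the zero padding at the sequence edges is an assumption of this sketch, and the default clipping value of 20 is taken from the original Deep Speech paper rather than from this text.

```python
from typing import List

Frame = List[float]  # one time slice of audio features (e.g. MFCCs)


def clipped_relu(x: float, clip: float = 20.0) -> float:
    """Rectified linear unit capped at `clip`: min(max(x, 0), clip)."""
    return min(max(x, 0.0), clip)


def add_context(frames: List[Frame], context: int) -> List[Frame]:
    """Concatenate each frame with `context` past and `context` future frames.

    Positions before the first frame or after the last frame are filled
    with zeros (an assumption of this sketch, not taken from the text).
    """
    if not frames:
        return []
    zero = [0.0] * len(frames[0])
    windows: List[Frame] = []
    for t in range(len(frames)):
        window: Frame = []
        # Walk from the oldest past frame to the furthest future frame.
        for offset in range(-context, context + 1):
            i = t + offset
            window.extend(frames[i] if 0 <= i < len(frames) else zero)
        windows.append(window)
    return windows


# Toy input: three time slices with two features each.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
stacked = add_context(features, context=1)
activations = [clipped_relu(v) for v in (-3.0, 7.5, 42.0)]
```

With `context=1`, every stacked frame is three times as wide as the original frame, which is the form in which the nonrecurrent layers would see the past and future context. The clipped ReLU leaves values between 0 and the cap unchanged, maps negative inputs to 0, and maps anything above the cap to the cap.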
