O'Reilly logo

Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Building an RNN model for speech recognition

We will be using the free-spoken digits audio dataset from https://github.com/Jakobovski/free-spoken-digit-dataset/tree/master/recordings for our basic model. Download the data to any directory on your system. In the example code, replace the path referring to the .wav file with the path you have copied the data to.

Note that we have split the data into training data which includes 1,470 files and 30 for the test set.

Before we get into the details of the model itself, we will look at how to prepare it for the training. The most common preprocessing step used in practice is to transform the raw audio data into its frequency spectrum. The frequency spectrum or power spectrum is like a fingerprint ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required