How to do it...

The strategy discussed above is coded as follows (the code file is available as Voice transcription.ipynb on GitHub):

  1. Download the dataset and import the relevant packages:
$ wget http://www.openslr.org/resources/12/train-clean-100.tar.gz
$ tar xzvf train-clean-100.tar.gz

import librosa
import numpy as np
import pandas as pd
  2. Read all the file names and their corresponding transcriptions and turn them into separate lists:
import os, numpy as np
org_path = '/content/LibriSpeech/train-clean-100/'
count = 0
inp = []
k=0
audio_name = []
audio_trans = []
for dir1 in os.listdir(org_path):
    dir2_path = org_path+dir1+'/'
    for dir2 in os.listdir(dir2_path):
        dir3_path = dir2_path+dir2+'/'
        for audio in os.listdir(dir3_path):
            if audio.endswith('.txt'):
                ...
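
The listing above is cut off after the `.txt` check, so the rest of the recipe's loop is not shown here. As an independent illustration of the same step, and not the book's own code, the following minimal sketch pairs audio paths with transcriptions. It assumes the usual LibriSpeech layout, where each chapter folder holds a `*.trans.txt` file whose lines read `<utterance-id> <TRANSCRIPT>` and the matching audio is `<utterance-id>.flac` in the same folder. The helper name `collect_transcripts` is hypothetical.

```python
import os

def collect_transcripts(root):
    """Pair each audio path with its transcript (hypothetical helper).

    Assumption: every '.txt' file under `root` is a LibriSpeech-style
    transcript file with lines of the form '<utterance-id> <TRANSCRIPT>',
    and the audio for an utterance is '<utterance-id>.flac' next to it.
    """
    audio_name, audio_trans = [], []
    for folder, _subdirs, files in os.walk(root):
        for fname in sorted(files):
            if not fname.endswith('.txt'):
                continue
            with open(os.path.join(folder, fname), encoding='utf-8') as fh:
                for line in fh:
                    line = line.strip()
                    if not line:
                        continue
                    # Split on the first space only: id, then full transcript
                    utt_id, _, text = line.partition(' ')
                    audio_name.append(os.path.join(folder, utt_id + '.flac'))
                    audio_trans.append(text)
    return audio_name, audio_trans
```

Calling `collect_transcripts(org_path)` would return two parallel lists, playing the role of `audio_name` and `audio_trans` in the recipe. Using `os.walk` rather than three nested `os.listdir` loops is a design choice of this sketch; it avoids hard-coding the directory depth.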
