Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

Preprocessing the audio data

The MFCC features are extracted from the audio data as in the previous example. In addition, each frame is combined with nctx context frames on either side, zero-padded at both ends of the clip, following the context used in the original paper:

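# Imports this excerpt relies on; mfcc is assumed to be python_speech_features.mfcc.
import numpy as np
from scipy.io import wavfile
from python_speech_features import mfcc

def audiofile_to_vector(audio_fname, n_mfcc_features, nctx):
    # Read the waveform and compute the MFCC features for each frame.
    sampling_rate, raw_w = wavfile.read(audio_fname)
    mfcc_ft = mfcc(raw_w, samplerate=sampling_rate, numcep=n_mfcc_features)
    # Keep every second frame.
    mfcc_ft = mfcc_ft[::2]
    n_strides = len(mfcc_ft)
    # Zero-pad nctx frames at both ends so every frame has a full context.
    dummy_ctx = np.zeros((nctx, n_mfcc_features), dtype=mfcc_ft.dtype)
    mfcc_ft = np.concatenate((dummy_ctx, mfcc_ft, dummy_ctx))
    # Overlapping windows of 2*nctx+1 frames, built as a read-only view.
    w_size = 2*nctx+1
    input_vector = np.lib.stride_tricks.as_strided(
        mfcc_ft,
        (n_strides, w_size, n_mfcc_features),
        (mfcc_ft.strides[0], mfcc_ft.strides[0], mfcc_ft.strides[1]),
        writeable=False)
    input_vector ...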
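The context windowing step can be tried on its own with a small array instead of real MFCC features. The following is a minimal sketch, not the book's code: the helper name add_context and the toy 4x2 feature matrix are hypothetical, chosen only to make the padding and the strided view easy to inspect.

```python
import numpy as np

def add_context(features, nctx):
    """Return a read-only view in which each frame is stacked with
    nctx neighbouring frames on each side (zero-padded at the ends).

    `add_context` is a hypothetical helper used for illustration only.
    """
    n_frames, n_features = features.shape
    # Zero frames stand in for the missing context at both ends.
    pad = np.zeros((nctx, n_features), dtype=features.dtype)
    padded = np.concatenate((pad, features, pad))
    w_size = 2 * nctx + 1
    # Stepping one row for both the frame axis and the window axis
    # makes window i cover padded[i : i + w_size] without copying.
    return np.lib.stride_tricks.as_strided(
        padded,
        shape=(n_frames, w_size, n_features),
        strides=(padded.strides[0], padded.strides[0], padded.strides[1]),
        writeable=False,
    )

# Toy input: 4 frames with 2 features each (assumed values).
features = np.arange(8, dtype=np.float64).reshape(4, 2)
windows = add_context(features, nctx=1)
print(windows.shape)  # (4, 3, 2)
print(windows[0])     # zero frame, then frames 0 and 1
```

The middle slot of every window is the original frame, so windows[:, nctx, :] reproduces the input. Because as_strided returns a view over the padded buffer, passing writeable=False guards against accidental writes that would alter several overlapping windows at once.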
