Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

The Griffin-Lim-based postprocessing module

The purpose of the postprocessing block is to convert the predicted mel-spectrogram frame into the corresponding waveform. 

A CBHG module is applied directly on top of the predicted mel-spectrogram frame. It extracts both backward and forward features (thanks to the bidirectional GRU at its end) and corrects errors in the predicted frame. The output of this module is the predicted raw spectrogram.

Although a spectrogram is a good way to represent speech, it lacks phase information. Luckily, signal-processing algorithms such as Griffin-Lim (https://ieeexplore.ieee.org/document/1164317/) can infer a likely speech waveform by estimating the phase from the spectrogram. It iteratively attempts ...
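To make the phase-estimation idea concrete, here is a minimal NumPy sketch of a Griffin-Lim-style loop. It is not the book's implementation: the function names (stft, istft, griffin_lim) and the parameters (n_fft, hop, n_iter, seed) are hypothetical choices made for illustration, and the sketch assumes the input is a linear-scale magnitude spectrogram computed with the same window settings. It starts from a random phase, then alternates between resynthesizing a waveform and keeping only the phase of that waveform's spectrogram, while the magnitude stays fixed to the predicted one.

```python
import numpy as np


def stft(signal, n_fft=256, hop=64):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=256, hop=64):
    """Inverse STFT: windowed overlap-add, normalized by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    signal = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        start = i * hop
        signal[start:start + n_fft] += frames[i] * window
        norm[start:start + n_fft] += window ** 2
    # Guard against division by zero where the window sum vanishes (the edges).
    return signal / np.maximum(norm, 1e-8)


def griffin_lim(magnitude, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    # Assumption: start from a uniformly random phase.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        # Resynthesize a waveform from the fixed magnitude and current phase.
        signal = istft(magnitude * phase, n_fft, hop)
        # Keep only the phase of the resynthesized waveform's spectrogram.
        rebuilt = stft(signal, n_fft, hop)
        phase = np.exp(1j * np.angle(rebuilt))
    return istft(magnitude * phase, n_fft, hop)
```

As a usage example, the magnitude of a test tone can be fed back through the loop: with mag = np.abs(stft(tone)), the call griffin_lim(mag, n_iter=30) returns a waveform of the same length whose spectrogram magnitude should be closer to mag than the random-phase starting point. In practice, a library routine would typically be preferred over a hand-written loop like this one.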
