The attention-based decoder

Here, the decoder applies an attention mechanism over the encoder's output to produce mel-spectrogram frames.

For each mini-batch, the decoder is first given a GO frame, an input frame containing only zeros. At every subsequent time step, the previously predicted mel-spectrogram frame is used as input. This input is passed through a pre-net with exactly the same architecture as the encoder's pre-net, as sketched below.
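To make the GO frame and the pre-net concrete, the following is a minimal sketch in PyTorch; the framework choice, the class name, and the (256, 128) layer sizes (taken from the original Tacotron paper) are illustrative assumptions, not the book's own code.

import torch
import torch.nn as nn

class PreNet(nn.Module):
    """Two fully connected layers with ReLU and dropout.
    Layer sizes (256, 128) follow the original Tacotron paper."""
    def __init__(self, in_dim, sizes=(256, 128), dropout=0.5):
        super().__init__()
        dims = [in_dim] + list(sizes)
        self.layers = nn.ModuleList(
            [nn.Linear(d_in, d_out) for d_in, d_out in zip(dims, dims[1:])]
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        for layer in self.layers:
            x = self.dropout(torch.relu(layer(x)))
        return x

# The GO frame is simply an all-zero spectrogram frame that starts decoding
# (80 mel bands, as in the original Tacotron setup):
batch_size, n_mels = 32, 80
go_frame = torch.zeros(batch_size, n_mels)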

The pre-net is followed by a one-layer GRU. The attention mechanism uses this GRU's output as a query over the encoder's output to produce a context vector. The GRU output is then concatenated with the context vector to form the input of the decoder RNN block, as shown in the sketch below.
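The book does not spell out the attention computation itself; the original Tacotron model uses Bahdanau-style (content-based) attention, sketched below with illustrative names and an assumed attention size of 256.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BahdanauAttention(nn.Module):
    """Content-based attention: scores each encoder time step against
    the current decoder query, then returns the weighted sum of the
    encoder outputs (the context vector)."""
    def __init__(self, query_dim, memory_dim, attn_dim=256):
        super().__init__()
        self.query_proj = nn.Linear(query_dim, attn_dim, bias=False)
        self.memory_proj = nn.Linear(memory_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, query, memory):
        # query:  (batch, query_dim)         -- attention-GRU output
        # memory: (batch, steps, memory_dim) -- encoder outputs
        scores = self.v(torch.tanh(
            self.query_proj(query).unsqueeze(1) + self.memory_proj(memory)
        )).squeeze(-1)                       # (batch, steps)
        weights = F.softmax(scores, dim=-1)  # alignment over encoder steps
        context = torch.bmm(weights.unsqueeze(1), memory).squeeze(1)
        return context                       # (batch, memory_dim)

# The attention-GRU output and the context vector are then concatenated
# to form the decoder RNN's input:
# decoder_input = torch.cat([gru_output, context], dim=-1)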

The decoder RNN is a two-layer residual GRU that uses vertical residual ...
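In the original Tacotron decoder, which this description follows, the vertical residual connections add each GRU layer's input to its own output, easing gradient flow through the stack. A minimal sketch, again with illustrative names and the paper's 256-unit hidden size:

import torch
import torch.nn as nn

class ResidualGRUDecoder(nn.Module):
    """Two stacked GRU cells with vertical residual connections:
    each layer's output is added to that layer's input before being
    passed up the stack."""
    def __init__(self, input_dim, hidden_dim=256):
        super().__init__()
        # Project the concatenated (attention-GRU output, context) input
        # to the hidden size so the residual additions line up.
        self.input_proj = nn.Linear(input_dim, hidden_dim)
        self.gru1 = nn.GRUCell(hidden_dim, hidden_dim)
        self.gru2 = nn.GRUCell(hidden_dim, hidden_dim)

    def forward(self, x, h1, h2):
        x = self.input_proj(x)
        h1 = self.gru1(x, h1)
        out1 = x + h1             # first vertical residual connection
        h2 = self.gru2(out1, h2)
        out2 = out1 + h2          # second vertical residual connection
        return out2, h1, h2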
