Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

Full architecture, with attention

Now, let's combine the previously defined functions to form the full Tacotron model.

But first, let's define some extra parameters that characterize the network:

NB_CHARS_MAX = 200  # maximum length of the input text
EMBEDDING_SIZE = 256
K1 = 16  # number of 1-D convolution blocks in the encoder CBHG
K2 = 8  # number of 1-D convolution blocks in the postprocessing CBHG
BATCH_SIZE = 32

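To make the role of NB_CHARS_MAX concrete, here is a minimal sketch of how an input text could be turned into a fixed-length sequence of integer IDs. The vocabulary, the padding index 0, and the helper name encode_text are assumptions made for illustration; they are not taken from the book's code.

```python
NB_CHARS_MAX = 200  # same value as in the parameters above

# Hypothetical character vocabulary; ID 0 is reserved for padding.
VOCAB = "abcdefghijklmnopqrstuvwxyz '.,?!"
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(VOCAB)}


def encode_text(text, max_len=NB_CHARS_MAX):
    """Map text to a list of exactly max_len integer IDs.

    Characters outside the vocabulary are dropped, longer texts are
    truncated, and shorter texts are right-padded with 0.
    """
    ids = [CHAR_TO_ID[ch] for ch in text.lower() if ch in CHAR_TO_ID]
    ids = ids[:max_len]
    return ids + [0] * (max_len - len(ids))


if __name__ == "__main__":
    encoded = encode_text("Hello, world.")
    print(len(encoded), encoded[:5])
```

Every encoded text has the same length, so BATCH_SIZE of them can be stacked into one integer array before being passed to an embedding layer of size EMBEDDING_SIZE.
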
Note that the model is defined by two input objects and two output objects.

The two input objects correspond to the encoder input and the decoder input. The former is expected to be the input text. The latter should be the last of the r mel-spectrogram frames predicted by the decoder, taken before the postprocessing CBHG. The ...
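As an illustration of the decoder input described above, the following sketch builds one input frame per decoder step from a target mel-spectrogram during training. It assumes the frames are grouped r at a time and that the first step receives an all-zero frame; that initial frame and the helper name decoder_inputs are assumptions for this example, not details confirmed by the text.

```python
def decoder_inputs(mel_frames, r):
    """Return one decoder input frame per decoder step.

    mel_frames: non-empty list of frames (each a list of floats) whose
                length is a multiple of r.
    Step 0 gets an all-zero frame (assumed convention); step t gets the
    last of the r frames that belong to step t - 1.
    """
    if not mel_frames or len(mel_frames) % r != 0:
        raise ValueError("need a non-empty frame list with length divisible by r")
    n_mels = len(mel_frames[0])
    inputs = [[0.0] * n_mels]  # assumed initial frame for the first step
    for step in range(1, len(mel_frames) // r):
        inputs.append(mel_frames[step * r - 1])
    return inputs


if __name__ == "__main__":
    # Six toy frames with two mel bins each, grouped r = 3 at a time.
    frames = [[float(i)] * 2 for i in range(6)]
    print(decoder_inputs(frames, r=3))
```

With six frames and r = 3 there are two decoder steps, so the function returns two input frames: the zero frame and the third target frame.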