February 2018
Intermediate to advanced
262 pages
6h 59m
English
Decoder is a Long Short-Term Memory (LSTM) layer which will generate a caption for an image. To build a simple model, we can just pass the encoder embedding as input to the LSTM only once. But it could be quite challenging for the decoder to learn; instead, it is common practice to provide the encoder embedding at every step of the decoder. Intuitively, a decoder learns to generate a sequence of text that best describes the caption of a given image.