An encoder is an RNN with LSTM or GRU cells; it can also be a bidirectional RNN. We feed the input sentence to the encoder and, instead of taking its output, we take the hidden state from the final time step as the embedding of the whole sentence. Let's understand encoders better with an example.
Suppose we are using an RNN with a GRU cell and the input sentence is "what are you doing". Let's represent the hidden state of the encoder with e:
The preceding diagram shows how the encoder computes the thought vectors; this is explained as follows:
- In the first time step, we pass to the GRU cell the input, which is the first word in the input sentence, ...
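
The encoder described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation behind the diagram: the `GRUCell` class, the `encode` function, the toy 3-dimensional word vectors, the hidden size of 4, the zero initial hidden state, and the omission of bias terms are all choices made for this example. The sketch uses the convention in which the update gate weights the candidate state; some references swap the roles of `z` and `1 - z`.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def sigmoid(v):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Hypothetical bias-free GRU cell with update (z), reset (r), and candidate (c) weights."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrc"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrc"}

    def step(self, x, e_prev):
        """Compute the next hidden state from input x and previous hidden state e_prev."""
        z = sigmoid(add(matvec(self.W["z"], x), matvec(self.U["z"], e_prev)))
        r = sigmoid(add(matvec(self.W["r"], x), matvec(self.U["r"], e_prev)))
        gated = [ri * ei for ri, ei in zip(r, e_prev)]
        pre = add(matvec(self.W["c"], x), matvec(self.U["c"], gated))
        c = [math.tanh(p) for p in pre]
        # Interpolate between the old state and the candidate state.
        return [(1 - zi) * ei + zi * ci for zi, ei, ci in zip(z, e_prev, c)]


def encode(words, embeddings, cell, hidden_size):
    """Run the cell over the sentence and return only the final hidden state."""
    e = [0.0] * hidden_size  # assumed initial hidden state (zeros)
    for word in words:
        e = cell.step(embeddings[word], e)
    return e  # the thought vector; per-step outputs are discarded


sentence = "what are you doing".split()
emb_rng = random.Random(1)
# Toy word vectors, invented for the example.
embeddings = {w: [emb_rng.uniform(-1, 1) for _ in range(3)] for w in sentence}
cell = GRUCell(input_size=3, hidden_size=4)
thought_vector = encode(sentence, embeddings, cell, hidden_size=4)
print(len(thought_vector))
```

The loop carries a single hidden state e from word to word, and only the state left after the last word is returned. That final state is what the text calls the thought vector: its size depends on the hidden size, not on the length of the sentence.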