In this example, we will build and train an LSTM-based autoencoder to generate sentence vectors for documents in the Reuters-21578 corpus (https://archive.ics.uci.edu/ml/datasets/Reuters-21578+Text+Categorization+Collection). We have already seen in Chapter 5, Word Embeddings, how to represent a word using word embeddings to create vectors that capture its meaning in the context of the words it appears with. Here, we will see how to build similar vectors for sentences. A sentence is a sequence of words, so a sentence vector captures the meaning of the sentence as a whole.
The easiest way to build a sentence vector is to just add up the word vectors and divide by the number of words. However, this treats ...
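This averaging baseline can be sketched in a few lines. The word vectors below are hypothetical stand-ins; in practice they would come from a pretrained embedding such as word2vec or GloVe, as discussed in Chapter 5:

```python
import numpy as np

# Hypothetical 3-dimensional word vectors for illustration only;
# real embeddings would be learned or loaded from a pretrained model.
embeddings = {
    "the": np.array([0.1, 0.3, 0.2]),
    "cat": np.array([0.5, 0.1, 0.4]),
    "sat": np.array([0.2, 0.6, 0.1]),
}

def sentence_vector(words, embeddings):
    """Average the word vectors of a sentence (the simple baseline)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    return np.mean(vecs, axis=0)

print(sentence_vector(["the", "cat", "sat"], embeddings))
```

Note that this sketch simply ignores out-of-vocabulary words; it also discards word order entirely, which is the weakness that motivates the sequence-aware LSTM approach in this example.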