Keras autoencoder example — sentence vectors

In this example, we will build and train an LSTM-based autoencoder to generate sentence vectors for documents in the Reuters-21578 corpus (https://archive.ics.uci.edu/ml/datasets/Reuters-21578+Text+Categorization+Collection). In Chapter 5, Word Embeddings, we saw how to represent a word as a vector that captures its meaning in the context of the words it appears with. Here, we will build similar vectors for sentences. A sentence is a sequence of words, so a sentence vector represents the meaning of the whole sentence.
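To make the idea concrete, here is a minimal sketch (not the book's exact model) of an LSTM-based autoencoder in Keras. It assumes sentences arrive as padded sequences of pre-computed word embeddings; all sizes (`seq_len`, `embed_dim`, `latent_dim`) are illustrative assumptions. The encoder LSTM compresses each sentence into a fixed-size vector, and the decoder LSTM tries to reconstruct the embedding sequence from that vector.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, RepeatVector
from tensorflow.keras.models import Model

# Illustrative sizes, not values from the text.
seq_len, embed_dim, latent_dim = 50, 100, 512

# Encoder: compress the embedding sequence into one latent vector.
inputs = Input(shape=(seq_len, embed_dim))
encoded = LSTM(latent_dim)(inputs)  # this is the sentence vector

# Decoder: repeat the latent vector and unroll it back into a sequence.
decoded = RepeatVector(seq_len)(encoded)
decoded = LSTM(embed_dim, return_sequences=True)(decoded)

# Train end to end to reconstruct the input embeddings.
autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")

# After training, reuse the encoder alone to produce sentence vectors.
encoder = Model(inputs, encoded)
```

Once trained, calling `encoder.predict` on a batch of sentences yields one `latent_dim`-sized vector per sentence.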

The easiest way to build a sentence vector is simply to add up its word vectors and divide by the number of words. However, this treats ...
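The averaging baseline above can be sketched in a few lines. The tiny embedding dictionary here is a hypothetical stand-in for real pre-trained word vectors such as GloVe:

```python
import numpy as np

# Hypothetical 4-dimensional word embeddings for illustration only.
word_vectors = {
    "the": np.array([0.1, 0.2, 0.0, 0.3]),
    "cat": np.array([0.9, 0.1, 0.4, 0.2]),
    "sat": np.array([0.3, 0.8, 0.5, 0.1]),
}

def sentence_vector(words, embeddings):
    """Average the word vectors of a sentence.

    Note the weakness of this baseline: word order is discarded,
    and every word contributes equally to the result.
    """
    vecs = [embeddings[w] for w in words if w in embeddings]
    return np.mean(vecs, axis=0)

v = sentence_vector(["the", "cat", "sat"], word_vectors)
```

Because the mean ignores ordering, "the cat sat" and "sat the cat" map to the same vector, which is exactly the limitation that motivates a sequence model such as the LSTM autoencoder.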
