Improving LSTMs – generating text with words instead of n-grams

Here we will discuss ways to improve LSTMs. First, we will see how the number of model parameters grows if we use one-hot-encoded word features. This motivates us to use low-dimensional word vectors instead of one-hot-encoded vectors. Finally, we will see how to employ word vectors in the code to generate better-quality text than with bigrams. The code for this section is available in lstm_word2vec.ipynb in the ch8 folder.

The curse of dimensionality

One major limitation stopping us from using words instead of n-grams as the input to our LSTM is that doing so drastically increases the number of parameters in the model: a one-hot-encoded word vector has as many dimensions as there are words in the vocabulary, which can easily run to tens of thousands, and every one of those dimensions needs its own set of input weights. Let's understand this through an example.
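As a rough illustration (a minimal sketch, not the code from the book's notebook; the vocabulary size of 50,000, the embedding size of 128, and the 256 LSTM units are assumed values chosen for the example), we can compare the parameter count of an LSTM fed one-hot-encoded words with one fed low-dimensional word vectors through an Embedding layer:

# Minimal sketch comparing parameter counts. The sizes below are
# illustrative assumptions, not values taken from the book.
import tensorflow as tf

vocabulary_size = 50000   # assumed vocabulary size
embedding_size = 128      # assumed word-vector dimensionality
num_nodes = 256           # assumed number of LSTM units

# LSTM fed one-hot-encoded words: the input dimensionality equals the
# vocabulary size, so the input-to-hidden weights dominate the model.
one_hot_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, vocabulary_size)),
    tf.keras.layers.LSTM(num_nodes),
])

# LSTM fed low-dimensional word vectors via an Embedding layer: the
# recurrent layer now only sees embedding_size-dimensional inputs.
embedding_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None,), dtype='int32'),
    tf.keras.layers.Embedding(vocabulary_size, embedding_size),
    tf.keras.layers.LSTM(num_nodes),
])

# LSTM parameters follow 4 * (input_dim * H + H * H + H) for H units.
print("one-hot LSTM params:  ", one_hot_model.count_params())
# 4 * (50,000*256 + 256*256 + 256) = 51,463,168 (about 51.5M)
print("embedding LSTM params:", embedding_model.count_params())
# 50,000*128 + 4 * (128*256 + 256*256 + 256) = 6,794,240 (about 6.8M)

On this assumed configuration, the one-hot model needs roughly 51.5 million parameters, while the embedding-based model needs under 7 million, most of which sit in the embedding table rather than in the recurrent layer itself.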
