Chapter 7. Long Short-Term Memory Networks

In this chapter, we will discuss a more advanced RNN variant known as Long Short-Term Memory Networks (LSTMs). LSTMs are widely used in many sequential tasks (including stock market prediction, language modeling, and machine translation) and have proven to perform better than other sequential models (for example, standard RNNs), especially when large amounts of data are available. LSTMs are designed specifically to avoid the vanishing gradient problem that we discussed in the previous chapter.

The main practical limitation posed by the vanishing gradient is that it prevents the model from learning long-term dependencies. However, by avoiding the vanishing gradient problem, LSTMs have the ability to retain information over many time steps and can therefore learn such long-term dependencies.
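
To make this concrete, the following is a minimal sketch of a language model built around a single LSTM layer. It assumes the TensorFlow 2 Keras API, and the vocabulary size, embedding dimension, and unit count are illustrative placeholders rather than values from this chapter:

import tensorflow as tf

# Illustrative hyperparameters (placeholders, not values from the chapter)
VOCAB_SIZE = 10000     # number of distinct tokens
EMBEDDING_DIM = 128    # size of each token embedding
LSTM_UNITS = 256       # dimensionality of the LSTM cell state

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    # The LSTM's gated cell state provides a path along which gradients can
    # flow across many time steps, which is what lets the model pick up
    # long-term dependencies that a standard RNN tends to forget.
    tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True),
    tf.keras.layers.Dense(VOCAB_SIZE),  # logits over the next token
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# A forward pass on a dummy batch of 2 sequences, each 20 tokens long
dummy_inputs = tf.random.uniform((2, 20), maxval=VOCAB_SIZE, dtype=tf.int32)
logits = model(dummy_inputs)  # shape: (2, 20, VOCAB_SIZE)

Trained on next-token targets, a model of this shape is a standard starting point for the language modeling task mentioned above.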
