7

Understanding Long Short-Term Memory Networks

In this chapter, we will discuss the fundamentals behind a more advanced RNN variant known as Long Short-Term Memory Networks (LSTMs). Here, we will focus on understanding the theory behind LSTMs, so we can discuss their implementation in the next chapter. LSTMs are widely used in many sequential tasks (including stock market prediction, language modeling, and machine translation) and have proven to perform better than older sequential models (for example, standard RNNs), especially given the availability of large amounts of data. LSTMs are designed to avoid the problem of the vanishing gradient that we discussed in the previous chapter.

The main practical limitation posed by the vanishing gradient ...

Get Natural Language Processing with TensorFlow - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.