Working with long short-term memory units

An alternative to basic recurrent layers is the Long Short-Term Memory (LSTM) unit. Like the GRU that we discussed in the previous section, this recurrent layer works with gates, except the LSTM uses three gates (forget, input, and output) rather than the GRU's two, and it maintains an additional cell state.

The following diagram outlines the structure of the LSTM layer:

The LSTM unit has a cell state that is central to how this layer type works. The cell state is carried across long spans of time and changes only gradually, which is what allows the LSTM to remember long-range dependencies. The LSTM layer also has a hidden state, but this state serves a different role: it is the gated output of the unit at each timestep.

In short, the LSTM has a long-term memory, modeled as the cell state, and a short-term memory, modeled as the hidden state.
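To make the gate mechanics concrete, here is a minimal, single-unit LSTM step written in plain Python. This is an illustrative sketch, not the book's CNTK code: the function name `lstm_step` and the toy scalar weights are assumptions chosen for readability, and real layers operate on weight matrices and vectors rather than scalars.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    # Illustrative scalar LSTM step (hypothetical helper, not CNTK API).
    # w maps each gate name to (input weight, recurrent weight, bias).
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])  # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])  # input gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])  # output gate
    c_tilde = math.tanh(w["c"][0] * x + w["c"][1] * h_prev + w["c"][2])  # candidate

    # Cell state: mostly preserved (long-term memory), gently updated.
    c = f * c_prev + i * c_tilde
    # Hidden state: gated read-out of the cell (short-term memory / output).
    h = o * math.tanh(c)
    return h, c

# Run a short input sequence through the cell with toy weights.
weights = {g: (0.5, 0.5, 0.0) for g in ("f", "i", "o", "c")}
h, c = 0.0, 0.0
for x in (1.0, -1.0, 0.5):
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

Note how the cell state `c` is only scaled and nudged each step, while the hidden state `h` is recomputed from it through the output gate, which is the long-term versus short-term split described above.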
