Long short-term memory

In 1997, Hochreiter and Schmidhuber proposed a modified RNN model, called long short-term memory (LSTM), as a solution to the vanishing gradient problem. The hidden layer in the RNN is replaced by an LSTM cell. The LSTM cell consists of three gates: the forget gate, the input gate, and the output gate. These gates control the amount of long-term memory and short-term memory generated and retained by the cell. All of the gates use the sigmoid function, which squashes the input to a value between 0 and 1. Next, we will see how the outputs from the various gates are calculated. In case the expressions seem daunting to you, do not worry: we will be using TensorFlow's tf.contrib.rnn.BasicLSTMCell and tf.contrib.rnn.static_rnn, which implement these details for us.
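To make the gate computations concrete, here is a minimal NumPy sketch of a single LSTM step using the standard formulation: sigmoid-activated forget, input, and output gates, plus a tanh candidate state. The function name `lstm_step` and the layout of `W` (the four gate weight blocks stacked row-wise) are illustrative choices for this sketch, not TensorFlow's internal representation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4*n_hidden, n_input + n_hidden); its four row blocks
    hold the forget, input, candidate, and output gate weights.
    """
    n_hidden = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0 * n_hidden:1 * n_hidden])  # forget gate: what to drop from c_prev
    i = sigmoid(z[1 * n_hidden:2 * n_hidden])  # input gate: what to write to the cell
    g = np.tanh(z[2 * n_hidden:3 * n_hidden])  # candidate cell state
    o = sigmoid(z[3 * n_hidden:4 * n_hidden])  # output gate: what to expose as h
    c = f * c_prev + i * g                     # new long-term memory
    h = o * np.tanh(c)                         # new short-term memory (hidden state)
    return h, c

# Toy run with random weights over five time steps
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):
    h, c = lstm_step(x, h, c, W, b)
```

Because the gates are sigmoids with outputs between 0 and 1, and the hidden state is an output-gated tanh of the cell state, every component of `h` stays strictly between -1 and 1, regardless of the input sequence.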
