December 2018
Intermediate to advanced
158 pages
3h 58m
English
The following is the LSTM model class we will use for MNIST:
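
The class listing does not appear in this excerpt, so what follows is only a minimal sketch of what such a class might look like. The class name LSTMModel, the default sizes (input_size=28, hidden_size=128, num_layers=1, num_classes=10), and the choice to classify from the last time step are all assumptions made for illustration, not details taken from the text.

```python
import torch
import torch.nn as nn


class LSTMModel(nn.Module):
    """Hypothetical LSTM classifier for MNIST rows treated as a sequence."""

    def __init__(self, input_size=28, hidden_size=128, num_layers=1, num_classes=10):
        super().__init__()
        # All sizes here are assumed defaults, not values from the text.
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        # batch_first=True because inputs are (batch, sequence, feature).
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        # Fully connected linear layer used as the output layer.
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        batch_size = x.size(0)
        # Initial hidden state h0 and cell state c0, both zero-filled,
        # with shape (num_layers, batch, hidden_size).
        h0 = torch.zeros(self.num_layers, batch_size, self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, batch_size, self.hidden_size, device=x.device)
        # out has shape (batch, sequence, hidden_size): one vector per time step.
        out, _ = self.lstm(x, (h0, c0))
        # Assumption: classify from the final time step only.
        return self.fc(out[:, -1, :])
```

With these assumed sizes, a batch of four 28 x 28 images passed in as a tensor of shape (4, 28, 28) yields an output of shape (4, 10), one score per digit class.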

Notice that nn.LSTM is passed the same arguments as the previous RNN. This is not surprising, since an LSTM is a recurrent network that works on sequences of data. Remember that the input tensor has axes of the form (batch, sequence, feature), so we set batch_first=True. We initialize a fully connected linear layer for the output layer. Notice that in the forward method, as well as initializing a hidden state tensor, h0, we also initialize a tensor to hold the cell state, c0. Remember also that the out tensor contains all 28 time steps. For our prediction, we are only ...