March 2018
272 pages
I'm only going to use one LSTM layer here, with just 10 neurons, as shown in the following code:
lstm1 = LSTM(10, activation='tanh', return_sequences=False, dropout=0.2, recurrent_dropout=0.2, name='lstm1')(embedding)
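The line above assumes an `embedding` tensor defined earlier in the chapter. As a minimal self-contained sketch, with a hypothetical vocabulary size, sequence length, and surrounding layers chosen purely for illustration (not the book's actual model), it might sit in a network like this:

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

# Hypothetical sizes for illustration only; the real dataset's
# vocabulary and sequence length will differ.
vocab_size = 1000
seq_len = 50

inputs = Input(shape=(seq_len,), name='input')
embedding = Embedding(vocab_size, 32, name='embedding')(inputs)

# The layer from the text: 10 units, with both kinds of dropout.
lstm1 = LSTM(10, activation='tanh', return_sequences=False,
             dropout=0.2, recurrent_dropout=0.2, name='lstm1')(embedding)

# An illustrative binary-classification head.
output = Dense(1, activation='sigmoid', name='output')(lstm1)
model = Model(inputs=inputs, outputs=output)

# A dummy batch just to confirm the shapes line up.
x = np.random.randint(0, vocab_size, size=(4, seq_len))
preds = model.predict(x)
print(preds.shape)
```

Note that setting a non-zero `recurrent_dropout` forces Keras to fall back from the fused cuDNN kernel to the generic LSTM implementation, so training will be slower on a GPU.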
Why am I using such a small LSTM layer? As you're about to see, this model is going to struggle with overfitting. Even just 10 LSTM units can learn the training data a little too well. The obvious fix would be to add more data, but we really can't, so keeping the network structure simple is a good idea.
That leads us to the use of dropout. I will use both dropout and recurrent dropout on this layer. We haven't talked about recurrent dropout yet, so let's cover that now. Normal dropout, applied on ...