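The update rule of gradient descent has the following form:

$$\theta_t = \theta_{t-1} - \alpha_t \nabla_{\theta_{t-1}} \mathcal{L}_t$$

Here, $\theta_t$ is the parameter at time step $t$, $\nabla_{\theta_{t-1}} \mathcal{L}_t$ is the gradient of the loss at $t$, and $\alpha_t$ is the learning rate at time $t$.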
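The cell state update of an LSTM cell, on the other hand, looks like this:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

where $f_t$ is the forget gate, $i_t$ is the input gate, $\tilde{c}_t$ is the candidate cell state, and $\odot$ denotes elementwise multiplication.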
This update looks very similar ...
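To make the similarity concrete, here is a minimal sketch in plain Python. It assumes the standard forms of the two updates, `theta_t = theta_{t-1} - lr * grad` for gradient descent and `c_t = f * c_{t-1} + i * c_tilde` for the LSTM cell. The function names, the toy quadratic loss, and the numeric values are all hypothetical choices made for illustration. Under these assumptions, fixing the forget gate to 1, the input gate to the learning rate, and the candidate cell state to the negative gradient should reproduce a gradient descent step.

```python
def gd_step(theta_prev, grad, lr):
    """One gradient descent step: theta_t = theta_{t-1} - lr * grad."""
    return [p - lr * g for p, g in zip(theta_prev, grad)]


def lstm_cell_update(c_prev, f, i, c_tilde):
    """Elementwise LSTM cell update: c_t = f * c_{t-1} + i * c_tilde."""
    return [ft * c + it * ct for ft, c, it, ct in zip(f, c_prev, i, c_tilde)]


def toy_loss_grad(theta):
    """Gradient of the toy loss L(theta) = 0.5 * sum(theta_k ** 2) (illustrative only)."""
    return list(theta)


# Hypothetical parameter values and learning rate.
theta_prev = [1.0, -2.0, 0.5]
lr = 0.1
grad = toy_loss_grad(theta_prev)

# Plain gradient descent step.
theta_gd = gd_step(theta_prev, grad, lr)

# The same step written as an LSTM cell update:
# cell state = parameters, forget gate = 1, input gate = learning rate,
# candidate cell state = negative gradient.
n = len(theta_prev)
theta_lstm = lstm_cell_update(
    c_prev=theta_prev,
    f=[1.0] * n,
    i=[lr] * n,
    c_tilde=[-g for g in grad],
)

print(theta_gd)
print(theta_lstm)
```

In this sketch the gates are constants. In an actual LSTM they are computed from the inputs, so the forget and input gates would no longer be fixed to 1 and to the learning rate.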