We just learned how forward propagation works in RNNs and how it predicts the output. Now, we compute the loss, $L$, at each time step, $t$, to determine how well the RNN has predicted the output. We use the cross-entropy loss as our loss function. The loss at a time step $t$ can be given as follows:
$$L_t = -y_t \log(\hat{y}_t)$$
Here, $y_t$ is the actual output, ...
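To make the per-time-step loss concrete, here is a minimal sketch in plain Python. It assumes the actual output at each time step is a one-hot vector and the predicted output is a probability distribution (for example, from a softmax). The function name `cross_entropy`, the `eps` clipping value, and the toy vectors are illustrative assumptions, not part of the original text.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy loss at one time step: -sum_i y_i * log(y_hat_i).

    eps is an assumed small constant that keeps log() away from zero.
    """
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, y_pred))

# Hypothetical toy sequence: two time steps over a three-word vocabulary.
targets = [[0, 1, 0], [0, 0, 1]]                   # actual outputs (one-hot)
predictions = [[0.1, 0.7, 0.2], [0.3, 0.3, 0.4]]   # predicted probabilities

# One loss value per time step.
losses = [cross_entropy(y, y_hat) for y, y_hat in zip(targets, predictions)]

for t, loss in enumerate(losses):
    print(f"time step {t}: loss = {loss:.4f}")
```

With one-hot targets, only the probability assigned to the correct class contributes, so the loss at each step reduces to the negative log of that probability. The loss is small when the network is confident and correct, and large otherwise.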