This section describes some of the repercussions of the vanishing gradient problem:
- This problem occurs when we train a neural network model using a gradient-based optimization technique.
- Generally, adding more hidden layers lets the network learn more complex functions, and thus do a better job of predicting future outcomes. Deep learning owes much of its impact to the large number of hidden layers such models can have, ranging from 10 to 200. This depth makes it possible to make sense of complicated sequential data and to perform tasks such as speech recognition, image classification, image captioning, and more.
- The problem that arises when the two preceding points combine is that, in some cases, the gradients ...
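
To make the interaction between depth and gradient-based training concrete, here is a minimal sketch in plain Python. It is a toy setup and not taken from the text: the chain of one-unit sigmoid layers, the function name `gradient_magnitudes`, the weight range `[-1, 1]`, and the input `x = 0.5` are all assumptions chosen for illustration. It backpropagates through the chain and records how large the gradient is with respect to each layer's weight.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_magnitudes(num_layers, x=0.5, seed=0):
    """Return |d output / d w_k| for each layer k of a toy network.

    Hypothetical model: a_{k+1} = sigmoid(w_k * a_k), with a_0 = x.
    Each layer has a single unit, so every quantity is a scalar.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(num_layers)]

    # Forward pass: keep every activation for use in the backward pass.
    activations = [x]
    for w in weights:
        activations.append(sigmoid(w * activations[-1]))

    # Backward pass: apply the chain rule from the output towards the input.
    grad_a = 1.0  # d output / d a_L (the output is a_L itself)
    grads = [0.0] * num_layers
    for k in range(num_layers - 1, -1, -1):
        a_out = activations[k + 1]
        # The sigmoid derivative a * (1 - a) is never larger than 0.25.
        grad_z = grad_a * a_out * (1.0 - a_out)
        grads[k] = abs(grad_z * activations[k])  # |d output / d w_k|
        grad_a = grad_z * weights[k]             # pass the gradient one layer down
    return grads


if __name__ == "__main__":
    grads = gradient_magnitudes(20)
    print("gradient at first layer:", grads[0])
    print("gradient at last layer: ", grads[-1])
```

In this sketch every backward step multiplies the gradient by a sigmoid derivative (at most 0.25) and a weight (at most 1 in magnitude), so the gradient reaching the first layer of a 20-layer chain is many orders of magnitude smaller than the one at the last layer. Real networks use weight matrices rather than scalars, but the same product of per-layer factors appears.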