Stochastic gradient descent
Gradient descent is the means by which we search for a minimum of our loss function (ideally the global minimum), and it's how neural networks learn. In gradient descent, we calculate the gradient, or slope, of the loss function with respect to each of the network's parameters, then nudge each parameter in the direction that lowers the loss. Think about descending a hill blindfolded; the only way you can reach the bottom is by feeling the slope of the ground. In gradient descent, we use calculus to feel the slope of the ground and make sure we are headed in the right direction, toward a minimum.
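To make the hill-descent picture concrete, here is a minimal sketch of gradient descent on a one-parameter loss. The loss function, the starting point, the learning rate, and the names `loss`, `grad`, and `gradient_descent` are all assumptions chosen for illustration; they are not taken from any particular network.

```python
# Toy loss with a single minimum at w = 3 (an assumed example, not a real network loss).
def loss(w):
    return (w - 3.0) ** 2


# Derivative of the toy loss: this is the "slope of the ground" at w.
def grad(w):
    return 2.0 * (w - 3.0)


def gradient_descent(w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the slope, starting from w0."""
    w = w0
    for _ in range(steps):
        # Move downhill: subtract the slope scaled by the learning rate.
        w -= learning_rate * grad(w)
    return w


w_final = gradient_descent(w0=-5.0)
print(w_final, loss(w_final))
```

Each step shrinks the distance to the minimum by a constant factor here, so `w_final` ends up very close to 3. On a real loss surface the slope is rarely this well behaved, and the procedure may settle in a local minimum instead.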
In plain old gradient descent, we have to calculate the loss for every single sample that is passed into the network at each step, resulting in many redundant calculations.
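The cost described above can be sketched for a one-weight linear model with a squared-error loss. Everything here is an assumption made for illustration: the model `y = w * x`, the four data points, and the helper names `full_batch_gradient` and `single_sample_gradient`. The second helper shows the kind of single-sample estimate that the section title points to, for contrast only.

```python
import random

# Assumed toy data generated by y = 2x, so the best weight is w = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def full_batch_gradient(w, xs, ys):
    """Average squared-error gradient; visits every sample for one update."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def single_sample_gradient(w, xs, ys, rng):
    """Gradient from one randomly chosen sample; a noisy but cheap estimate."""
    i = rng.randrange(len(xs))
    return 2.0 * (w * xs[i] - ys[i]) * xs[i]


rng = random.Random(0)  # fixed seed so the sampled index is repeatable
print(full_batch_gradient(0.0, xs, ys))          # uses all 4 samples
print(single_sample_gradient(0.0, xs, ys, rng))  # uses 1 sample
```

With four samples the difference is trivial, but the full-batch version does work proportional to the dataset size on every update, while the single-sample version does a constant amount of work and trades accuracy for speed.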
We can mitigate ...