August 2018
Now that we have defined a loss function, we can use it to train our model. As the previous equations show, the loss is a function of the weights and biases. In principle, we could exhaustively search the space of weights and biases and pick the combination that minimizes the loss. With a one- or two-dimensional weight vector this might be feasible, but as the weight space grows, exhaustive search quickly becomes intractable and we need a more efficient approach. For this, we will use an optimization technique called gradient descent.
Using our loss function and calculus, gradient descent determines how to adjust the values of the weights and biases of our model in such a way that the value ...
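The idea above can be sketched in a few lines of code. This is a minimal, hypothetical example, not the book's implementation: it assumes a toy one-parameter loss L(w) = (w - 3)^2, whose gradient dL/dw = 2(w - 3) tells us which direction to move w to reduce the loss.

```python
# Minimal gradient descent sketch on a one-dimensional quadratic loss.
# The loss and its gradient here are stand-ins for a real model's
# loss over its weights and biases.

def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of the loss with respect to w
    return 2.0 * (w - 3.0)

def gradient_descent(w0, learning_rate=0.1, steps=100):
    w = w0
    for _ in range(steps):
        # Step in the direction opposite the gradient,
        # scaled by the learning rate
        w -= learning_rate * grad(w)
    return w

w_final = gradient_descent(w0=0.0)
print(round(w_final, 4))  # prints 3.0, the minimizer of the loss
```

Unlike an exhaustive search, each step uses only a single gradient evaluation, so the cost per update does not grow with the size of the search space, only with the cost of computing the gradient.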