November 2020
Intermediate to advanced
296 pages
9h 8m
English
This chapter covers

In the last chapter, you saw how to determine parameter values by optimizing a loss function with stochastic gradient descent (SGD). This approach also works for DL models that have millions of parameters. But how did we arrive at the loss function? In the linear regression problem (see sections 1.4 and 3.1), we used the mean squared error (MSE) as a loss function. We don’t claim ...