Learning Data Science

by Sam Lau, Joseph Gonzalez, Deborah Nolan
September 2023
Beginner
596 pages
15h 31m
English
O'Reilly Media, Inc.

Chapter 20. Numerical Optimization

At this point in the book, our modeling procedure should feel familiar: we define a model, choose a loss function, and fit the model by minimizing the average loss over our training data. We’ve seen several techniques to minimize loss. For example, we used both calculus and a geometric argument in Chapter 15 to find a simple expression for fitting linear models using squared loss.

But empirical loss minimization isn’t always so straightforward. Lasso regression, with the addition of the $L_1$ penalty to the average squared loss, no longer has a closed-form solution, and logistic regression uses cross-entropy loss to fit a nonlinear model. In these cases, we use numerical optimization to fit the model, where we systematically choose parameter values to evaluate the average loss in search of the minimizing value.
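
To make the lasso case concrete, here is one common way to write its objective. The notation is an assumption chosen for illustration ($n$ observations, feature row $\mathbf{x}_i$, outcome $y_i$, $p$ coefficients, and penalty weight $\lambda$) and may differ from the notation used elsewhere in the book:

```latex
\hat{\boldsymbol{\theta}}
  = \arg\min_{\boldsymbol{\theta}} \;
    \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \mathbf{x}_i \boldsymbol{\theta} \right)^2
    + \lambda \sum_{j=1}^{p} \lvert \theta_j \rvert
```

The absolute value terms are not differentiable at zero, which is one reason that setting the gradient to zero doesn't produce a simple formula here the way it does for unpenalized squared loss.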

When we introduced loss functions in Chapter 4, we performed a simple numerical optimization to find the minimizer of the average loss. We created a grid of θ values and evaluated the average loss at every point in the grid (see Figure 20-1). We then took the grid point with the smallest average loss as the best fit. Unfortunately, this sort of grid search quickly becomes impractical, for the following reasons:

  • For complex models with many features, the grid becomes unwieldy. With only four features and a grid of 100 values for each feature, we must evaluate the average loss at $100^4 = 100{,}000{,}000$ grid points.

  • The range of parameter ...
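
The grid search described above can be sketched in a few lines. This is a minimal illustration, not the book's code: the data values, the constant model, and the helper names `avg_squared_loss` and `grid_size` are all assumptions made up for this example.

```python
import numpy as np

# Hypothetical outcomes, invented for illustration only
y = np.array([2.0, 3.0, 5.0, 6.0, 9.0])

def avg_squared_loss(theta, y):
    """Average squared loss of a constant model that always predicts theta."""
    return np.mean((y - theta) ** 2)

# Evaluate the average loss at every point on an evenly spaced grid
thetas = np.linspace(0, 10, 101)
losses = np.array([avg_squared_loss(theta, y) for theta in thetas])

# The grid point with the smallest average loss is taken as the best fit
theta_hat = thetas[np.argmin(losses)]

def grid_size(values_per_param, n_params):
    """Number of loss evaluations needed for a full grid over all parameters."""
    return values_per_param ** n_params

print(theta_hat)
print(grid_size(100, 4))
```

For a constant model with squared loss, the minimizer is the sample mean, so the grid search should land on the grid point closest to `y.mean()`. The `grid_size` helper shows how the number of evaluations grows exponentially with the number of parameters, which is the first problem listed above.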

ISBN: 9781098112998