August 2018
Intermediate to advanced
378 pages
9h 9m
English
The basic concept of the L1 penalty, also known as the least absolute shrinkage and selection operator (Lasso; Hastie, T., Tibshirani, R., and Friedman, J. (2009)), is that a penalty is used to shrink weights toward zero. The penalty term uses the sum of the absolute values of the weights, so some weights may be shrunk to exactly zero. This means that Lasso can also be used as a type of variable selection. The strength of the penalty is controlled by a hyper-parameter, lambda (λ), which multiplies the sum of the absolute weights; it can be set to a fixed value or, as with other hyper-parameters, optimized using cross-validation or a similar approach.
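To make the shrinkage concrete, here is a minimal Python sketch, not code from the book. It assumes a simplified setting in which each weight is penalized on its own, so the penalized value has a closed form known as soft-thresholding. The function names `l1_penalty` and `soft_threshold` and the example weights are hypothetical, chosen only for illustration.

```python
def l1_penalty(weights, lam):
    """Return the L1 penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def soft_threshold(z, lam):
    """Return the w that minimizes 0.5 * (w - z)**2 + lam * abs(w).

    Assumption: a single weight considered in isolation, where z is
    the value the weight would take without any penalty.
    """
    if z > lam:
        return z - lam   # large positive weight: pulled toward zero by lam
    if z < -lam:
        return z + lam   # large negative weight: pulled toward zero by lam
    return 0.0           # small weight: set exactly to zero


# Hypothetical unpenalized weights and penalty strength.
unpenalized = [3.0, -0.4, 0.05, -2.0]
lam = 0.5

shrunk = [soft_threshold(z, lam) for z in unpenalized]
print(shrunk)                   # [2.5, 0.0, 0.0, -1.5]
print(l1_penalty(shrunk, lam))  # 2.0
```

The two small weights become exactly zero while the two large ones are only reduced by λ, which is why the L1 penalty can act as a form of variable selection. A full Lasso fit with several correlated predictors has no such one-step solution and is normally estimated iteratively.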
It is easier to describe Lasso if we use an ordinary least squares (OLS) regression model. In regression, ...