O'Reilly logo

Regression Analysis with Python by Alberto Boschetti, Luca Massaron

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Missing data

Missing data appears often in real-life data, sometimes randomly in random occurrences, more often because of some bias in its recording and treatment. All linear models work on complete numeric matrices and cannot deal directly with such problems; consequently, it is up to you to take care of feeding suitable data for the algorithm to process.

Even if your initial dataset does not present any missing data, it is still possible to encounter missing values in the production phase. In such a case, the best strategy is surely that of dealing with them passively, as presented at the beginning of the chapter, by standardizing all the numeric variables.

Tip

As for as indicator variables, in order to passively intercept missing values, a possible ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required