August 2019
Intermediate to advanced
342 pages
9h 35m
English
To handle missing values in datasets, the scikit-learn library provides dedicated imputer classes.
scikit-learn's strategy is imputation: each missing value is replaced with a new value inferred from the known part of the dataset.
Value imputation can be of two types: univariate and multivariate.
For univariate imputation, the SimpleImputer class is used. It replaces null values either with a constant value or with a summary statistic, such as the mean, median, or mode (most frequent value), calculated from the non-missing values of the column that contains the null value.
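A minimal sketch of univariate imputation with SimpleImputer may help make this concrete. The small array `X`, the variable names, and the choice of `np.nan` as the missing-value marker are assumptions made for illustration; they are not taken from the book.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy dataset (hypothetical): one missing value in each column,
# encoded as np.nan.
X = np.array([
    [1.0, 2.0],
    [np.nan, 4.0],
    [7.0, np.nan],
    [4.0, 6.0],
])

# Replace each missing value with the mean of the non-missing
# values in the same column.
mean_imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X_mean = mean_imputer.fit_transform(X)

# Replace each missing value with a fixed constant instead.
constant_imputer = SimpleImputer(
    missing_values=np.nan, strategy="constant", fill_value=0.0
)
X_const = constant_imputer.fit_transform(X)

print(mean_imputer.statistics_)  # per-column means learned during fit
print(X_mean)
print(X_const)
```

The other statistics mentioned above are selected in the same way, with `strategy="median"` or `strategy="most_frequent"`. Because each column is filled using only its own values, the result does not depend on any relationship between columns.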
In the following example, we see the replacement of null values (encoded ...