Imputing the missing values in data

Imputing is the more involved method of dealing with missing values. By imputing, we mean filling in missing data values with numerical quantities that are inferred from existing knowledge or data. We have a few options for filling in these missing values; the most common is to replace them with the mean of the remaining values in the column, as shown in the following code:

pima.isnull().sum()  # let's fill in the plasma column

times_pregnant                    0
plasma_glucose_concentration      5
diastolic_blood_pressure         35
triceps_thickness               227
serum_insulin                   374
bmi                              11
pedigree_function                 0
age                               0
onset_diabetes                    0
dtype: int64
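
As a minimal sketch of the mean-fill idea described above, the snippet below imputes one column with its own mean. It does not load the real dataset: `toy` is a hypothetical stand-in for `pima`, its values are made up for illustration, and only the column name `plasma_glucose_concentration` is taken from the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pima DataFrame (values invented for illustration).
toy = pd.DataFrame({
    "plasma_glucose_concentration": [148.0, np.nan, 183.0, 89.0, np.nan],
    "age": [50, 31, 32, 21, 33],
})

# Mean of the observed values; pandas skips NaN by default.
col_mean = toy["plasma_glucose_concentration"].mean()

# Replace each missing entry with the column mean and assign the result back.
toy["plasma_glucose_concentration"] = (
    toy["plasma_glucose_concentration"].fillna(col_mean)
)

# Every count should now be zero.
print(toy.isnull().sum())
```

Assigning the filled column back, rather than filling in place, keeps the step explicit and easy to repeat for other columns. Note that mean imputation leaves the column's mean unchanged but shrinks its variance, which is one reason it is treated as a simple baseline.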

Let's look at the five rows where plasma_glucose_concentration ...
