How it works...
In this recipe, we used pandas and Feature-engine to replace missing values in the numerical and categorical variables of the Credit Approval Data Set with values drawn at random from the observed values of the same variables. First, we loaded the data and divided it into train and test sets using train_test_split(), as described in the Performing mean or median imputation recipe.
To perform random sample imputation using pandas, we first counted the missing values in the variable using pandas isnull(), followed by sum(). Next, we used pandas dropna() to remove the missing values from the original variable in the train set, so that pandas sample() drew values only from observations with data. We extracted as many observations as ...
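The pandas steps described above can be sketched as follows. This is a minimal illustration and not the recipe's own code: the helper name random_sample_impute, the toy column A2, and its values are hypothetical, and sampling with replacement is an assumption made here so that the sketch still works when a variable has more missing values than observed ones.

```python
import numpy as np
import pandas as pd


def random_sample_impute(train, test, column, seed=0):
    """Fill NaN in `column` of both frames with values drawn from the train set.

    Hypothetical helper for illustration; returns imputed copies.
    """
    train, test = train.copy(), test.copy()

    # Pool of observed (non-missing) values, taken from the train set only
    observed = train[column].dropna()

    for offset, df in enumerate((train, test)):
        missing = df[column].isnull()
        n_missing = int(missing.sum())  # isnull() followed by sum()
        if n_missing == 0:
            continue

        # Draw one observed value per missing entry
        # (replace=True is an assumption of this sketch)
        sampled = observed.sample(
            n=n_missing, replace=True, random_state=seed + offset
        )

        # Re-index the sample so it aligns with the rows that have NaN
        sampled.index = df.index[missing]
        df.loc[missing, column] = sampled

    return train, test


# Toy data standing in for one numerical variable
train = pd.DataFrame({"A2": [30.0, np.nan, 25.0, 41.0, np.nan]})
test = pd.DataFrame({"A2": [np.nan, 52.0]})

train_imputed, test_imputed = random_sample_impute(train, test, "A2")
print(train_imputed["A2"].isnull().sum(), test_imputed["A2"].isnull().sum())
```

Drawing the sample from the train set for both frames keeps information from the test set out of the imputation, and fixing random_state makes the replacement values reproducible.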