February 2018
Intermediate to advanced
450 pages
11h 27m
English
This is also one of the common approaches because of its simplicity. In the case of a numerical feature, you can just replace the missing values with the mean or median. You can also use this approach in the case of categorical variables by assigning the mode (the value that has the highest occurrence) to the missing values.
The following code assigns the median of the non-missing values of the Fare feature to the missing values:
# handling the missing values by replacing it with the median faredf_titanic_data['Fare'][np.isnan(df_titanic_data['Fare'])] = df_titanic_data['Fare'].median()
Or, you can use the following code to find the value that has the highest occurrence in the Embarked feature and assign it to the ...
Read now
Unlock full access