O'Reilly logo

Effective Amazon Machine Learning by Alexis Perrier

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Missing values

We have missing values for the age and fare variables with respective median values 28 and 14.5. We can replace all the missing values with the median values with this statement:

select coalesce(age, 28) as age, coalesce(fare, 14.5) as fare from titanic;

We also want to keep the information that there was a missing value at least for the age variable. We can do that with the query that creates a new binary variable that we name: is_age_missing:

SELECT age, CASE  WHEN age is null THEN 0 ELSE 1 END as is_age_missingfrom titanic;

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required