O'Reilly logo

Practical Predictive Analytics by Ralph Winters

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Standardizing data

There are a couple more transformations that we can perform on our hospital dataset.

We would like to standardize both the Total.Costs variable and the Total.Charges variable so that we can compare them more easily. Predictive models like to see variables standardized because there is less bias towards any particular variable, since all standardized variables have approximately the same magnitude.

The way we do this is by normalizing the means to 0, with a standard deviation of 1. Computational means that we will take each value, subtract the mean and divide the results by the standard deviation.

In previous examples, we have explicitly referred to variables using their full name (for example, df$Total.Costs, which refers ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required