Normalizing or standardizing data in a data frame

Distance computations play a big role in many data analytics techniques. Variables with larger values tend to dominate these computations, so you may want to use standardized (Z) values instead.
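To see the effect, here is a minimal sketch using a small made-up data frame (the column names and values are hypothetical, chosen only for illustration). It compares Euclidean distances computed with dist() before and after standardizing with scale():

```r
# Hypothetical data: income is on a much larger scale than age
people <- data.frame(income = c(30000, 32000, 90000),
                     age    = c(25, 60, 26))

# Raw distances: differences in income swamp differences in age
d.raw <- as.matrix(dist(people))

# Distances on standardized values: both variables contribute comparably
d.z <- as.matrix(dist(scale(people)))

# Row 1 compared with rows 2 and 3, before and after standardizing
d.raw[1, 2:3]
d.z[1, 2:3]
```

On the raw values, row 1 looks far closer to row 2 than to row 3, even though rows 1 and 2 differ by 35 years in age. After standardizing, the two distances are of similar size.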

Getting ready

Download the BostonHousing.csv data file and store it in your R environment's working directory. Then read the data:

> housing <- read.csv("BostonHousing.csv")

How to do it...

To standardize all the variables in a data frame containing only numeric variables, use:

> housing.z <- scale(housing)
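As a quick check of the result, the following sketch uses R's built-in mtcars data set as a stand-in for the housing data (an assumption made so the example runs without the CSV file). It shows that scale() returns a matrix whose columns have mean 0 and standard deviation 1, and that the values match a Z score computed by hand:

```r
# mtcars stands in for the housing data (assumption for illustration)
cars.z <- scale(mtcars)

# scale() returns a numeric matrix, not a data frame
is.matrix(cars.z)

# Each column now has a mean of (approximately) 0 and a standard deviation of 1
round(colMeans(cars.z), 10)
apply(cars.z, 2, sd)

# The same Z values computed by hand for one column
mpg.manual <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)
all.equal(as.numeric(cars.z[, "mpg"]), mpg.manual)

# Convert back if later steps need a data frame
cars.z.df <- as.data.frame(cars.z)
```

If you need a data frame rather than a matrix for later steps, wrapping the result in as.data.frame() is one option.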

The scale() function works only on data frames in which every variable is numeric. Otherwise, you will get an error.
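If a data frame mixes numeric and non-numeric variables, one possible workaround is to standardize only the numeric columns and leave the rest untouched. The sketch below assumes a small made-up data frame (the names homes, rooms, price, and town are hypothetical):

```r
# Hypothetical data frame with one non-numeric column
homes <- data.frame(rooms = c(4, 6, 8),
                    price = c(100, 150, 260),
                    town  = c("A", "B", "C"))

# Logical vector marking which columns are numeric
num.cols <- sapply(homes, is.numeric)

# Standardize only the numeric columns; town is left as it was
homes.z <- homes
homes.z[num.cols] <- scale(homes[num.cols])

homes.z
```

Calling scale(homes) directly would fail because of the town column.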

How it works...

When invoked as above, the scale() function computes the ...
