R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

How to do it...

Perform the following steps to apply data reduction using PCA:

  1. First, let's read the dataset, set the row names to the state names, and inspect the first few rows:
> usArrests = read.csv("USArrests.csv", stringsAsFactors = FALSE)
> rownames(usArrests) = usArrests$X
> usArrests$X = NULL
> head(usArrests)
           Murder Assault UrbanPop Rape
Alabama      13.2     236       58 21.2
Alaska       10.0     263       48 44.5
Arizona       8.1     294       80 31.0
Arkansas      8.8     190       50 19.5
California    9.0     276       91 40.6
Colorado      7.9     204       78 38.7
  2. Calculating the variance column-wise (MARGIN = 2 in apply) shows how much each variable varies; we can observe that Assault has by far the largest variance:
> apply(usArrests, 2, var)
    Murder    Assault   UrbanPop       Rape
  18.97047 6945.16571  209.51878   87.72916
  3. To overcome the magnitude of the variables' ...
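
The first two steps can be sketched without the CSV file. This is a minimal example that assumes the built-in datasets::USArrests data frame holds the same data as USArrests.csv (it already uses state names as row names); the names col_var and top_var are illustrative and do not come from the recipe.

```r
# Assumption: the built-in datasets::USArrests data frame stands in for
# USArrests.csv, so no read.csv() or row-name fix-up is needed here.
usArrests <- datasets::USArrests

# Inspect the first few rows.
print(head(usArrests))

# MARGIN = 2 applies var() to each column (one value per variable);
# MARGIN = 1 would instead compute one value per state (row).
col_var <- apply(usArrests, 2, var)
print(col_var)

# Name of the variable with the largest variance.
top_var <- names(which.max(col_var))
print(top_var)
```

Because apply() converts the data frame to a matrix first, it works here only because all four columns are numeric; sapply(usArrests, var) would be an equivalent per-column alternative.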
