Principal Components Analysis

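Another technique for analyzing data is principal components analysis. Principal components analysis transforms a set of (possibly correlated) variables into a set of linearly uncorrelated variables, called principal components, ordered so that the first component accounts for the largest share of the variance in the data.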

In R, principal components analysis is available through the function prcomp in the stats package:

## S3 method for class 'formula':
prcomp(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
       tol = NULL, ...)
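
As a rough sketch of how the two methods relate, the following calls both on the built-in USArrests data frame. This data set is an assumption made for illustration only and is not the example used later in this section. The variable names pca_default and pca_formula are likewise hypothetical.

```r
# Illustrative data: the built-in USArrests data frame (an assumption, not the book's example)
data(USArrests)

# Default method: pass a numeric matrix or data frame directly
pca_default <- prcomp(USArrests, scale. = TRUE)

# Formula method: a one-sided formula (no response variable) names the columns to use
pca_formula <- prcomp(~ Murder + Assault + UrbanPop + Rape,
                      data = USArrests, scale. = TRUE)

# Both calls analyze the same columns, so the standard deviations should agree
print(all.equal(pca_default$sdev, pca_formula$sdev))

# The rotated variables are stored in $x; their pairwise correlations
# should be zero apart from floating-point error
print(round(cor(pca_default$x), 10))
```

The formula method is mostly a convenience for selecting columns and handling missing values through subset and na.action; the remaining arguments are passed on to the default method.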

Here is a description of the arguments to prcomp.

| Argument | Description | Default |
|---|---|---|
| formula | In the formula method, specifies a formula with no response variable, indicating the columns of a data frame to use in the analysis. | |
| data | An optional data frame containing the data specified in formula. | NULL |
| subset | An optional vector specifying observations to include in the analysis. | |
| na.action | A function specifying how to deal with NA values. | |
| x | In the default method, specifies a numeric or complex matrix of data for the analysis. | |
| retx | A logical value specifying whether rotated variables should be returned. | TRUE |
| center | A logical value specifying whether values should be zero centered. | TRUE |
| scale. | A logical value specifying whether values should be scaled to have unit variance. | FALSE |
| tol | A numeric value specifying a tolerance; components whose standard deviation is less than or equal to tol times the standard deviation of the first component are omitted. | NULL |
| ... | Additional arguments passed to other methods. | |
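
To make the effect of scale., tol, and retx concrete, here is a minimal sketch, again assuming the built-in USArrests data frame rather than the data used in this section. The variable names are hypothetical and chosen for illustration.

```r
# Illustrative data: the built-in USArrests data frame (an assumption, not the book's example)
data(USArrests)

# scale. = FALSE is the default, so variables with large variances dominate
pca_raw    <- prcomp(USArrests)
pca_scaled <- prcomp(USArrests, scale. = TRUE)

# Proportion of total variance accounted for by each component
prop_raw    <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
prop_scaled <- pca_scaled$sdev^2 / sum(pca_scaled$sdev^2)
print(round(prop_raw, 3))
print(round(prop_scaled, 3))

# tol omits components whose standard deviation is at most
# tol times the standard deviation of the first component
pca_tol <- prcomp(USArrests, scale. = TRUE, tol = 0.5)
print(ncol(pca_tol$x))

# retx = FALSE leaves the rotated variables out of the result
pca_noscores <- prcomp(USArrests, retx = FALSE)
print(is.null(pca_noscores$x))
```

When the variables are measured in different units, as they are here, setting scale. = TRUE is usually the safer choice, because otherwise the first component largely reflects whichever variable has the largest variance.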

As an example, let’s try principal components analysis on a matrix of team batting statistics. Let’s start by loading the ...
