PCA working methodology from first principles

The working of PCA is illustrated with the following sample data, in which each instance (data point) has two dimensions, X and Y. The objective is to reduce the 2D data to a single dimension (the first principal component):

Instance        X       Y
1            0.72    0.13
2            0.18    0.23
3            2.50    2.30
4            0.45    0.16
5            0.04    0.44
6            0.13    0.24
7            0.30    0.03
8            2.65    2.10
9            0.91    0.91
10           0.46    0.32
Column mean  0.83    0.69
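The column means in the last row of the table can be checked with a few lines of NumPy. This is a minimal sketch; the variable names `data` and `column_means` are chosen here for illustration and do not come from the original text.

```python
import numpy as np

# Sample data from the table above: one row per instance, columns are X and Y.
data = np.array([
    [0.72, 0.13],
    [0.18, 0.23],
    [2.50, 2.30],
    [0.45, 0.16],
    [0.04, 0.44],
    [0.13, 0.24],
    [0.30, 0.03],
    [2.65, 2.10],
    [0.91, 0.91],
    [0.46, 0.32],
])

# Average over the rows (axis=0) to get one mean per column.
column_means = data.mean(axis=0)

# Rounded to two decimals, as in the "Column mean" row of the table.
print(np.round(column_means, 2))  # [0.83 0.69]
```

The unrounded means are 0.834 and 0.686; the table reports them to two decimal places.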

 

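The first step, before any further analysis, is to subtract the column mean from every observation. This centers each variable at zero, so that the principal components describe variation around the mean rather than the location of the data. Note that centering alone does not put the variables on a common scale; that would additionally require dividing each variable by its standard deviation.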

X       Y
0.72 ...
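The centering step can be sketched in NumPy as follows. The excerpt stops at mean subtraction, so the remaining steps (covariance matrix, eigendecomposition, projection) are an assumption: they follow the standard covariance-eigenvector formulation of PCA and may differ in detail from how the original text continues. All variable names are illustrative.

```python
import numpy as np

# Sample data from the table: one row per instance, columns are X and Y.
data = np.array([
    [0.72, 0.13], [0.18, 0.23], [2.50, 2.30], [0.45, 0.16], [0.04, 0.44],
    [0.13, 0.24], [0.30, 0.03], [2.65, 2.10], [0.91, 0.91], [0.46, 0.32],
])

# Step 1 (from the text): subtract each column's mean so that both
# variables are centered at zero. The unrounded means are used here.
centered = data - data.mean(axis=0)

# Step 2 (assumed continuation): sample covariance matrix of the centered
# data. rowvar=False tells NumPy that columns are the variables.
cov = np.cov(centered, rowvar=False)

# Step 3 (assumed): eigendecomposition of the symmetric covariance matrix.
# np.linalg.eigh returns eigenvalues in ascending order, so the last
# eigenvector belongs to the largest eigenvalue (the first principal component).
eigenvalues, eigenvectors = np.linalg.eigh(cov)
first_pc = eigenvectors[:, -1]

# Step 4 (assumed): project the centered 2D points onto that direction,
# giving one coordinate per instance.
scores = centered @ first_pc

# Share of the total variance retained by the single component.
explained_ratio = eigenvalues[-1] / eigenvalues.sum()

print(centered.round(2))
print(scores.round(2))
print(round(float(explained_ratio), 3))
```

The sign of `first_pc`, and therefore of `scores`, is arbitrary: an eigenvector and its negation describe the same direction, so different libraries may return results that differ by a factor of -1.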
