PCA working methodology from first principles

The working of PCA is illustrated with the following sample data, in which each instance (data point) has two dimensions. The objective here is to reduce the 2D data to a single dimension, namely its projection onto the first principal component:

Instance        X       Y
1               0.72    0.13
2               0.18    0.23
3               2.50    2.30
4               0.45    0.16
5               0.04    0.44
6               0.13    0.24
7               0.30    0.03
8               2.65    2.10
9               0.91    0.91
10              0.46    0.32
Column mean     0.83    0.69
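The column means in the last row of the table can be checked with a few lines of NumPy. This is only a minimal sketch: the names `data` and `column_mean` are assumptions introduced for illustration and do not appear in the original text.

```python
import numpy as np

# Sample data from the table: one row per instance, columns are X and Y.
data = np.array([
    [0.72, 0.13],
    [0.18, 0.23],
    [2.50, 2.30],
    [0.45, 0.16],
    [0.04, 0.44],
    [0.13, 0.24],
    [0.30, 0.03],
    [2.65, 2.10],
    [0.91, 0.91],
    [0.46, 0.32],
])

# Average each column; axis=0 runs down the rows.
column_mean = data.mean(axis=0)

# Rounded to two decimals for comparison with the table's last row.
print(np.round(column_mean, 2))
```

Rounded to two decimals, the result should agree with the table: roughly 0.83 for X and 0.69 for Y.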

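The first step, prior to proceeding with any analysis, is to subtract the column mean from every observation. This centers each variable at zero, so that the principal components describe variation about the mean rather than the data's offset from the origin. Note that centering does not change the scale (spread) of the variables; that would require a separate standardization step.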

X       Y
0.72 ...
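The mean-subtraction step described above could be sketched as follows, assuming NumPy. The names `data` and `centered` are illustrative and not taken from the original text. The two checks at the end are there to show what centering does and does not change.

```python
import numpy as np

# Sample data from the first table: columns are X and Y.
data = np.array([
    [0.72, 0.13],
    [0.18, 0.23],
    [2.50, 2.30],
    [0.45, 0.16],
    [0.04, 0.44],
    [0.13, 0.24],
    [0.30, 0.03],
    [2.65, 2.10],
    [0.91, 0.91],
    [0.46, 0.32],
])

# Subtract each column's mean from every observation in that column.
# Broadcasting applies the length-2 mean vector to all ten rows.
centered = data - data.mean(axis=0)

# After centering, each column averages to zero (up to floating-point error).
print(np.allclose(centered.mean(axis=0), 0.0))

# The spread of each column is the same as before centering.
print(np.allclose(centered.std(axis=0), data.std(axis=0)))
```

Both checks should print `True`: the columns are now centered at zero, while their standard deviations are untouched.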
