The working of PCA is illustrated with the following sample data, in which each instance (data point) has two dimensions. The objective is to reduce the 2D data to a single dimension, the principal component:
| Instance | X | Y |
|---|---|---|
| 1 | 0.72 | 0.13 |
| 2 | 0.18 | 0.23 |
| 3 | 2.50 | 2.30 |
| 4 | 0.45 | 0.16 |
| 5 | 0.04 | 0.44 |
| 6 | 0.13 | 0.24 |
| 7 | 0.30 | 0.03 |
| 8 | 2.65 | 2.10 |
| 9 | 0.91 | 0.91 |
| 10 | 0.46 | 0.32 |
| Column mean | 0.83 | 0.69 |
The first step, before proceeding with any analysis, is to subtract the column mean from every observation. This centers each variable at zero, removing its offset (though not its scale), so that all dimensions are measured about a common origin.
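The mean-subtraction step can be sketched in a few lines of plain Python. This is a minimal illustration, not the original author's code: the helper names `column_means` and `center` are hypothetical, and the data is copied from the sample table. The sketch uses the unrounded column means (0.834 and 0.686), so individual centered values may differ in the last digit from a hand calculation that uses the rounded means 0.83 and 0.69.

```python
# Sample data from the table above: one (X, Y) pair per instance.
data = [
    (0.72, 0.13), (0.18, 0.23), (2.50, 2.30), (0.45, 0.16), (0.04, 0.44),
    (0.13, 0.24), (0.30, 0.03), (2.65, 2.10), (0.91, 0.91), (0.46, 0.32),
]


def column_means(rows):
    """Return the mean of each column as a tuple (hypothetical helper)."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


def center(rows):
    """Subtract each column's mean from every value in that column."""
    means = column_means(rows)
    return [tuple(v - m for v, m in zip(row, means)) for row in rows]


mean_x, mean_y = column_means(data)
centered = center(data)

print(f"column means: X={mean_x:.2f}, Y={mean_y:.2f}")
for i, (cx, cy) in enumerate(centered, start=1):
    print(f"{i:2d}  {cx:6.2f}  {cy:6.2f}")
```

After centering, each column sums to zero (up to floating-point error), which is a quick check that the step was applied correctly.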
X |
Y |
0.72 ... |