For example, two classification classes can be obtained by finding the mean value of the feature with the largest $\lambda_i$.
8. For compression, reduce the dimensionality of the new feature vectors by setting to zero the components with low $\lambda_i$ values. Features in the original data space can be obtained by $\mathbf{c}_X^T = \mathbf{W}^T \mathbf{c}_Y^T$. A short sketch of both steps is given after this list.
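Code 12.1 in the example below illustrates both steps graphically, but it does not perform the actual class assignment. The following lines are a minimal sketch of how the two steps could be written; the variable names class and cyc are ours, and cy and W are assumed to have been computed as in Code 12.1 (Matlab's eig returns eigenvalues in ascending order, so the second column of cy corresponds to the largest eigenvalue).

%Sketch only: assumes cy and W as computed in Code 12.1
meanY=mean(cy);
class=cy(:,2)>meanY(2); %two classes: samples above/below the mean of the
                        %transformed feature with the largest eigenvalue
cyc=cy; cyc(:,1)=0;     %compression: zero the low-eigenvalue component
xr=transpose(transpose(W)*transpose(cyc)); %back to the original feature space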
12.10 Example
Code 12.1 is a Matlab implementation of PCA, illustrating the method by a simple example
with two features in the matrix cx.
In the example code, the covariance matrix is called covX and it is computed by the Matlab function cov. The code also computes the covariance by evaluating the two alternative definitions given by Equations 12.22 and 12.23. Notice that the implementation of these equations divides the matrix products by m − 1 instead of m. In statistics, this is called an unbiased estimator and it is the estimator used by Matlab in the function cov. Thus, we use m − 1 to obtain the same covariance values as the Matlab function.
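For reference, with $\mathbf{c}_X$ the $m \times n$ data matrix, $\boldsymbol{\mu}$ the row vector of feature means and $\mathbf{1}$ a column vector of $m$ ones, the two forms evaluated in the code can be written as follows (our notation, which may differ slightly from that of Equations 12.22 and 12.23 in the main text):
$$\boldsymbol{\Sigma}_X = \frac{1}{m-1}\left(\mathbf{c}_X - \mathbf{1}\boldsymbol{\mu}\right)^T\left(\mathbf{c}_X - \mathbf{1}\boldsymbol{\mu}\right) = \frac{1}{m-1}\mathbf{c}_X^T\mathbf{c}_X - \frac{m}{m-1}\boldsymbol{\mu}^T\boldsymbol{\mu}$$
Both forms use the $m-1$ divisor of the unbiased estimator and therefore agree with cov.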
To solve the eigenproblem, we use the Matlab function eig. This function solves the
characteristic equation $\det\left(\lambda_i \mathbf{I} - \boldsymbol{\Sigma}_X\right) = 0$ to obtain the eigenvalues and find the eigenvectors. In
the code the results of this function are stored in the matrices L and W, respectively. In general,
the characteristic equation defines a polynomial of higher degree requiring elaborate numerical
methods to find its solution. In our example, we have only two features, thus the characteristic
equation defines the quadratic
$$\lambda_i^2 - 1.208\,\lambda_i + 0.039 = 0 \qquad (12.45)$$
for which the eigenvalues can be easily obtained as $\lambda_1 = 0.0331$ and $\lambda_2 = 1.175$. The eigenvectors
can be obtained by substitution of these values in the eigenproblem. For example, for the first
eigenvector, we have:
$$\begin{bmatrix} 0.033 - 0.543 & -0.568 \\ -0.568 & 0.033 - 0.665 \end{bmatrix} \mathbf{w}_1 = 0 \qquad (12.46)$$
Thus,
$$\mathbf{w}_1 = \begin{bmatrix} -1.11\,s \\ s \end{bmatrix} \qquad (12.47)$$
where s is an arbitrary constant. After normalizing this vector, we obtain the first eigenvector
$$\mathbf{w}_1 = \begin{bmatrix} -0.74 \\ 0.66 \end{bmatrix} \qquad (12.48)$$
Similarly, the second eigenvector is obtained as
$$\mathbf{w}_2 = \begin{bmatrix} 0.66 \\ 0.74 \end{bmatrix} \qquad (12.49)$$
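These hand calculations can be checked numerically. The following lines are a minimal verification, using the 2 × 2 matrix whose entries appear in Equation 12.46 (rather than a covariance computed from the sample data):

S=[0.543 0.568; 0.568 0.665]; %covariance values appearing in Equation 12.46
[W,L]=eig(S) %diag(L) is approximately (0.033, 1.175); the columns of W
             %match Equations 12.48 and 12.49 up to sign and rounding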
%PCA
%Feature Matrix cx. Each column represents a feature and
%each row a sample data
cx= [1.4000 1.5500
3.0000 3.2000
0.6000 0.7000
2.2000 2.3000
1.8000 2.1000
2.0000 1.6000
1.0000 1.1000
2.5000 2.4000
1.5000 1.6000
1.2000 0.8000
2.1000 2.5000 ];
[m,n]=size(cx);
%Data Graph
figure(1);
plot(cx(:,1),cx(:,2),'k+'); hold on; %Data
plot(([0,0]),([-1,4]),'k-'); hold on; %X axis
plot(([-1,4]),([0,0]),'k-'); %Y axis
axis([-1,4,-1,4]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Original Data');
%Covariance Matrix
covX=cov(cx)
%Covariance Matrix using the matrix definition
meanX=mean(cx) %mean of each feature (column) of cx
cx1=cx(:,1)-meanX(1); %subtract mean of first feature
cx2=cx(:,2)-meanX(2); %subtract mean of second feature
Mcx=[cx1 cx2]; %mean-centred data
covX=(transpose(Mcx)*(Mcx))/(m-1) %definition of covariance
%Covariance Matrix using alternative definition
meanX=mean(cx); %mean of each feature (column) of cx
cx1=cx(:,1); %first feature (column) of cx
cx2=cx(:,2); %second feature (column) of cx
covX=((transpose(cx)*cx)/(m-1))- ...
((transpose(meanX)*meanX)*(m/(m-1))) %alternative definition of covariance
%Compute eigenvalues and eigenvectors
[W,L]=eig(covX) %W=Eigenvectors (columns) L=Eigenvalues (diagonal)
%Eigenvector Graph
figure(2);
plot(cx(:,1),cx(:,2),'k+'); hold on;
plot(([0,W(1,1)*4]),([0,W(2,1)*4]),'k-'); hold on; %first eigenvector (column 1 of W)
plot(([0,W(1,2)*4]),([0,W(2,2)*4]),'k-'); %second eigenvector (column 2 of W)
axis([-4,4,-4,4]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Eigenvectors');
%Transform Data
cy=cx*transpose(W)
%Graph Transformed Data
figure(3);
plot(cy(:,1),cy(:,2),'k+'); hold on;
plot(([0,0]),([-1,5]),'k-'); hold on;
plot(([-1,5]),([0,0]),'k-');
axis([-1,5,-1,5]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Transformed Data');
%Classification example
meanY=mean(cy);
%Graph of classification example
figure(4);
plot(([-5,5]),([meanY(2),meanY(2)]),'k:'); hold on;
plot(([0,0]),([-5,5]),'k-'); hold on;
plot(([-1,5]),([0,0]),'k-'); hold on;
plot(cy(:,1),cy(:,2),'k+'); hold on;
axis([-1,5,-1,5]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Classification Example');
legend('Mean','Location','NorthWest');
%Compression example
cy(:,1)=0; %set the component with the smallest eigenvalue to zero
xr=transpose(transpose(W)*transpose(cy)); %reconstruct features in the original space
%Graph of compression example
figure(5);
plot(xr(:,1),xr(:,2),'k+'); hold on;
plot(([0,0]),([-1,4]),'k-'); hold on;
plot(([-1,4]),([0,0]),'k-');
axis([-1,4,-1,4]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Compression Example');
Code 12.1 Matlab PCA implementation
Figure 12.1 shows the original data and the eigenvectors. The eigenvector with the largest
eigenvalue defines a line that goes through the points. This is the direction of the largest variance
of the data.
Figure 12.1 Data samples and the eigenvectors: (a) original data; (b) eigenvectors (Feature 2 plotted against Feature 1 in both panels)
Figure 12.2 shows the results obtained by transforming the features according to $\mathbf{c}_Y = \mathbf{c}_X \mathbf{W}^T$. Basically, the eigenvectors become our main axes. The points are more spread along the axis of the second transformed feature, which corresponds to its larger eigenvalue. Remember that for the transformed data the covariance matrix is diagonal, thus there is no linear dependence between the features.
Figure 12.2 Transformed data (Feature 2 plotted against Feature 1)
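The absence of linear dependence can be confirmed directly from the transformed data. A quick check, assuming cy as produced by Code 12.1:

cov(cy) %off-diagonal entries are numerically zero: the transformed features
        %are uncorrelated, and the diagonal entries are the eigenvalues of covX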
