
108 CHAPTER 4 Feature Selection
Solution. Generate the data set.
randn('seed',0);
m=1; var=0.16;
stdevi=sqrt(var);
norm_dat=m+stdevi*randn(1,100);
Generate the outliers.
outl=[6.2 -6.4 4.2 15.0 6.8];
Add outliers at the end of the data.
dat=[norm_dat';outl'];
Scramble the data.
rand('seed',0); % randperm() below calls rand()
y=randperm(length(dat));
x=dat(y);
Identify outliers and their corresponding indices.
times=1; % controls the tolerance threshold
[outliers,Index,new_dat]=simpleOutlierRemoval(x,times);
[outliers Index]
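The source of simpleOutlierRemoval is not listed in this passage. As a rough sketch only, the lines below assume the rule is "flag every point that lies more than times standard deviations from the sample mean"; that rule, the short sample vector, and the variable names mu, sigma, and isOut are assumptions made for illustration, not the book's implementation.

```matlab
% Hypothetical sketch of a mean/standard-deviation outlier rule.
% Assumption: a point is an outlier if |x - mean(x)| > times*std(x).
x = [1.1 0.9 6.2 1.0 0.8 -6.4 1.2]';  % made-up sample, not the data above
times = 1;                             % tolerance, in standard deviations

mu = mean(x);                          % sample mean
sigma = std(x);                        % sample standard deviation (N-1 normalization)

isOut = abs(x - mu) > times*sigma;     % logical mask of flagged points
Index = find(isOut);                   % positions of the flagged points in x
outliers = x(isOut);                   % the flagged values themselves
new_dat = x(~isOut);                   % data with the flagged points removed

[outliers Index]
```

For this made-up sample, the values 6.2 and -6.4 (positions 3 and 6) are flagged and new_dat keeps the remaining five points. Because the mean and standard deviation are computed from data that still contain the outliers, a small value of times may be needed for the extreme points to stand out, which is consistent with the choice times=1 above.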
The new_dat vector contains the data after the outliers have been rejected. The program output should
look like this:
outliers index
4.2 3
6.8 ...