8 Unsupervised Learning

In Chapter 6 we discussed methods for reducing the dimension of the measurement space in order to decrease the cost of classification and to improve the ability to generalize. These procedures assumed that class labels were available for all training objects. In many practical applications, however, the training objects are not labelled, or only a small fraction of them are labelled. In these cases it can be worthwhile to let the data speak for itself: the structure in the data has to be discovered without the help of additional labels.

An example is colour-based pixel classification. In video-based surveillance and safety applications, for instance, one of the tasks is to track the foreground pixels. Foreground pixels are pixels belonging to the objects of interest, for example cars in a car park. The RGB representation of a pixel can be used to decide whether it belongs to the foreground or not. However, neither the foreground colours nor the background colours are known in advance. Unsupervised training methods can help to decide which pixels of the image belong to the background and which to the foreground.
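To make the pixel example concrete, the following is a minimal sketch that groups RGB values into two clusters without using any labels. It assumes a basic k-means loop with k = 2, which is only one of several possible unsupervised methods and is not named in the text above. The function names `sq_dist` and `two_means_rgb`, the initialisation rule and the pixel values are all hypothetical and chosen for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means_rgb(pixels, iterations=20):
    """Group RGB pixels into two clusters with a basic k-means loop (k = 2)."""
    # Assumed initialisation: the first pixel and the pixel farthest from it.
    first = pixels[0]
    centres = [first, max(pixels, key=lambda p: sq_dist(p, first))]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins the nearer of the two centres.
        labels = [0 if sq_dist(p, centres[0]) <= sq_dist(p, centres[1]) else 1
                  for p in pixels]
        # Update step: each centre moves to the mean colour of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # keep the old centre if a cluster ends up empty
                centres[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centres


# Made-up example data: three greyish pixels and three reddish pixels.
pixels = [(90, 90, 95), (100, 98, 102), (95, 97, 99),
          (200, 30, 40), (210, 25, 35), (190, 35, 45)]
labels, centres = two_means_rgb(pixels)
print(labels)
print(centres)
```

The clustering only separates the pixels into two colour groups. Deciding which group is the foreground still requires extra knowledge, for instance an assumption about which group is larger or how it changes over time.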

Another example is an insurance company that wants to know whether typical groups of customers exist, so that it can offer a suitable insurance package to each group. The information provided by an insurance expert may introduce a significant bias. Unsupervised methods can then help to discover additional ...
