Part III Beyond supervised learning

10 Topics in unsupervised learning

So far, we have always assumed that we have access to a data-generating process $D$ that provides a pair of inputs and outputs $(x,y)$, thus providing labeled data. In this section, we assume that we no longer have access to labeled data. We have access to a set of datapoints $X=\{x_i\}_{i=1}^m$ and want to make sense of them. More formally, unsupervised learning tasks often correspond to minimizing a loss function of the form

$$L(f)=\frac{1}{m}\sum_{i=1}^{m}\ell(f(x_i),x_i),$$

where $\{x_i;\ i=1,\dots,m\}$ is the unlabeled dataset, and $\ell$ is a nonnegative map measuring the error at one datapoint. This is similar to the empirical risk minimization framework in supervised learning, that is, (1.3) with $x_i$ in place ...
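To make the loss above concrete, the following minimal sketch evaluates it on a toy dataset. It rests on assumptions that are not in the text: the pointwise error is taken to be the squared Euclidean distance, and $f$ maps each point to the nearest of a fixed list of centroids. The names `squared_error`, `empirical_loss`, and `nearest_centroid_map` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]


def squared_error(prediction: Point, x: Point) -> float:
    """Assumed pointwise loss: squared Euclidean distance (nonnegative)."""
    return sum((p - xi) ** 2 for p, xi in zip(prediction, x))


def empirical_loss(
    f: Callable[[Point], Point],
    data: List[Point],
    loss: Callable[[Point, Point], float] = squared_error,
) -> float:
    """Average of loss(f(x_i), x_i) over the unlabeled dataset."""
    return sum(loss(f(x), x) for x in data) / len(data)


def nearest_centroid_map(centroids: List[Point]) -> Callable[[Point], Point]:
    """Build f sending a point to its closest centroid.

    The centroids are fixed by hand here; nothing is learned.
    """
    def f(x: Point) -> Point:
        return min(centroids, key=lambda c: squared_error(c, x))
    return f


# Toy unlabeled dataset: two well-separated pairs of points in the plane.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

f_one = nearest_centroid_map([(5.0, 1.0)])               # one centroid
f_two = nearest_centroid_map([(0.0, 1.0), (10.0, 1.0)])  # two centroids

loss_one = empirical_loss(f_one, data)
loss_two = empirical_loss(f_two, data)
print(loss_one, loss_two)  # 26.0 1.0
```

In this toy setting, the map with two centroids attains a much smaller loss than the map with one, which illustrates how minimizing $L(f)$ over a class of maps can reveal structure in unlabeled data. Other choices of the pointwise error and of the class of maps $f$ would correspond to other unsupervised tasks.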
