May 2024
Intermediate to advanced
210 pages
5h 55m
English
So far, we have always assumed that we have access to a data-generating process D that provides a pair of inputs and outputs , thus providing labeled data. In this section, we assume that we no longer have access to labeled data. We have access to a set of datapoints and want to make sense of them. More formally, unsupervised learning tasks often correspond to minimizing a loss function of the form
where is the unlabeled dataset, and ℓ is a nonnegative map measuring the error at one datapoint. This is similar to the empirical risk minimization framework in supervised learning, that is, (1.3) with in place ...