One way to think about almost everything we do in data science is as dimension reduction. We are trying to learn, from high-dimensional x, low-dimensional summaries that contain the information necessary to make good decisions.
Dimension reduction can be supervised or unsupervised. In supervised learning, an outside “response” variable y dictates the direction of dimension reduction. In regression, a high-dimensional x is projected through coefficients β to create the low-dimensional (univariate) summary ŷ. Chapters 2–4 were all about supervised learning.
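To make the regression case concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the data are simulated, the sizes `n` and `p` are arbitrary, and the names `beta_true` and `beta_hat` are hypothetical. It fits β by ordinary least squares and shows each p-dimensional row of x collapsing to a single fitted value ŷ.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the simulation is repeatable

n, p = 200, 10                   # assumed sizes: 200 observations, 10 inputs
X = rng.normal(size=(n, p))      # simulated high-dimensional inputs x
beta_true = rng.normal(size=p)   # made-up coefficients used only to simulate y
y = X @ beta_true + rng.normal(scale=0.5, size=n)  # the response that directs the reduction

# Ordinary least squares estimate of beta (no intercept, to keep the sketch short)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Projecting through beta_hat turns each 10-dimensional row into one number
y_hat = X @ beta_hat

print(X.shape, "->", y_hat.shape)
```

The response y enters only through the choice of β: a different y would give a different projection direction for the same x, which is the sense in which the reduction is supervised.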
In contrast, for unsupervised learning there is no response or outcome. ...