One way to think about almost everything we do in data science is as dimension reduction. We are trying to learn from high-dimensional x some low-dimensional summaries that contain the information necessary to make good decisions.

Dimension reduction can be supervised or unsupervised. In supervised learning, an outside “response” variable y dictates the direction of dimension reduction. In regression, a high-dimensional x is projected through coefficients β to create the low-dimensional (univariate) summary ŷ. Chapters 2–4 were all about supervised learning.
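To make the regression case concrete, here is a minimal sketch in Python with NumPy. It is an illustration only: the simulated data, the dimensions n and p, and the names beta_true and beta_hat are assumptions, and the model omits an intercept for brevity. Each 50-dimensional row of X is projected through the fitted coefficients into a single number.

```python
import numpy as np

# Simulated data (an assumption for illustration): n observations of a p-dimensional x.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))

# Hypothetical "true" coefficients: only the first three inputs matter.
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Supervised step: the response y determines the direction beta_hat (least squares, no intercept).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Dimension reduction: each 50-dimensional row of X becomes one fitted value.
y_hat = X @ beta_hat

print(X.shape, y_hat.shape)
```

The response y is what picks the projection direction here; with a different y, the same X would be summarized along a different β.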

In contrast, for unsupervised learning there is no response or outcome. ...
