Earlier in this book, we used unsupervised learning to learn the underlying (hidden) structure in unlabeled data. Specifically, we performed dimensionality reduction, reducing a high-dimensional dataset to one with far fewer dimensions, and built an anomaly detection system. We also performed clustering, grouping objects together based on how similar or dissimilar they were to one another.
Now we will move on to generative unsupervised models, which learn a probability distribution from an original dataset and use it to make inferences about never-before-seen data. In later chapters, we will use such models to generate seemingly real data that is, at times, virtually indistinguishable from the original data.
Until now, we have looked mostly at discriminative models, which learn to separate observations based on patterns the algorithms find in the data; these models do not learn the probability distribution of the data. Discriminative models include supervised ones such as logistic regression and decision trees from Chapter 2, as well as clustering methods such as k-means and hierarchical clustering from Chapter 5.
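To make the generative idea concrete, here is a minimal sketch of "learning a probability distribution" and then using it both to score unseen observations and to generate new ones. It assumes a single multivariate Gaussian as the model, which is far simpler than the models covered in this chapter and is used here only for illustration. The toy dataset, the variable names, and the `log_density` helper are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical "original dataset": 500 two-dimensional observations
X = rng.normal(loc=[2.0, -1.0], scale=[1.0, 0.5], size=(500, 2))

# "Learn" the distribution: estimate the mean and covariance of one Gaussian
mean = X.mean(axis=0)
cov = np.cov(X, rowvar=False)


def log_density(points, mean, cov):
    """Log probability density of each row of `points` under N(mean, cov)."""
    d = mean.shape[0]
    diff = points - mean
    cov_inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of every point from the mean
    mahalanobis = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + mahalanobis)


# Inference on never-before-seen data: one typical point, one far-away point
new_points = np.array([[2.0, -1.0], [8.0, 4.0]])
scores = log_density(new_points, mean, cov)

# Generation: draw brand-new observations from the learned distribution
samples = rng.multivariate_normal(mean, cov, size=5)

print("log densities:", scores)
print("generated samples:\n", samples)
```

The point near the center of the training data receives a much higher log density than the distant one, and the generated samples fall in the same region as the original data. A discriminative model offers neither capability, because it never estimates the distribution itself.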
Let’s start with the simplest of the generative unsupervised models, the restricted Boltzmann machine.
Boltzmann machines were first invented in 1985 by Geoffrey Hinton (then a professor at Carnegie Mellon University and now one of the fathers ...