Chapter 9. Generative Models

All the problems we have looked at so far involve, in some way, translating from inputs to outputs. You create a model that takes an input and produces an output. Then you train it on input samples from a dataset, optimizing it to produce the best output for each one.

Generative models are different. Instead of taking a sample as input, they produce a sample as output. You might train the model on a library of photographs of cats, and it would learn to produce new images that look like cats. Or, to give a more relevant example, you might train it on a library of known drug molecules, and it would learn to generate new “drug-like” molecules for use as candidates in a virtual screen. Formally speaking, a generative model is trained on a collection of samples that are drawn from some (possibly unknown, probably very complex) probability distribution. Its job is to produce new samples from that same probability distribution.
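To make that definition concrete, here is a toy sketch of our own (not from the text) in which the “model” is nothing more than a Gaussian whose parameters are estimated from the training samples. Real generative models learn far more complex distributions, but the interface is the same: fit to a collection of samples, then draw new ones.

    import numpy as np

    rng = np.random.default_rng(0)

    # Training samples drawn from some distribution the model does not
    # know in advance (here, a normal distribution with mean 5, std 2).
    train = rng.normal(loc=5.0, scale=2.0, size=10_000)

    # "Training" the toy model: estimate the distribution's parameters.
    mu, sigma = train.mean(), train.std()

    # "Generation": draw new samples from the learned distribution.
    new_samples = rng.normal(loc=mu, scale=sigma, size=5)
    print(new_samples)  # new points that resemble the training data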

In this chapter, we will begin by describing the two most popular types of generative models: variational autoencoders and generative adversarial networks. We will then discuss a few applications of these models in the life sciences, and work through some code examples.

Variational Autoencoders

An autoencoder is a model that tries to make its output equal to its input. You train it on a library of samples and adjust the model parameters so that on every sample the output is as close as possible to the input.

That sounds trivial. Can’t the model simply copy its input straight through to the output? If it were free to do that, the task would indeed be pointless. The trick is that the data is forced through a middle layer with far fewer elements than the input: a bottleneck. The first half of the model (the encoder) must compress each sample into a short latent vector, and the second half (the decoder) must reconstruct the sample from that vector alone. To do this well, the model has to learn a compact representation that captures the essential features of the data.
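As an illustration, here is a minimal autoencoder sketch in Keras. The layer sizes and the random stand-in data are arbitrary choices of ours; the essential points are the narrow latent layer and the fact that the training target passed to fit() is the input itself.

    import numpy as np
    from tensorflow import keras

    input_dim, latent_dim = 784, 32  # e.g., flattened 28x28 images

    # Encoder: squeeze the input down to a small latent vector.
    inputs = keras.Input(shape=(input_dim,))
    h = keras.layers.Dense(128, activation="relu")(inputs)
    latent = keras.layers.Dense(latent_dim, activation="relu")(h)

    # Decoder: reconstruct the input from the latent vector alone.
    h = keras.layers.Dense(128, activation="relu")(latent)
    outputs = keras.layers.Dense(input_dim, activation="sigmoid")(h)

    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")

    # Stand-in data; the target is the input itself, so the loss
    # measures how well the model reconstructs each sample.
    x = np.random.rand(1024, input_dim).astype("float32")
    autoencoder.fit(x, x, epochs=5, batch_size=64)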
