Chapter 6. Embedding and Representation Learning

Learning Lower-Dimensional Representations

In the previous chapter, we motivated the convolutional architecture with a simple argument: the larger our input vector, the larger our model. Large models with lots of parameters are expressive, but they are also increasingly data-hungry. Without sufficiently large volumes of training data, we will likely overfit. Convolutional architectures help us cope with the curse of dimensionality by reducing the number of parameters in our models without necessarily diminishing expressiveness.

Regardless, convolutional networks still require large amounts of labeled training data, and for many problems, labeled data is scarce and expensive to generate. Our goal in this chapter will be to develop effective learning models for situations where labeled data is scarce but wild, unlabeled data is plentiful. We'll approach this problem by learning embeddings, or low-dimensional representations, in an unsupervised fashion. Because these unsupervised models offload the heavy lifting of automatic feature selection, we can use the generated embeddings to solve learning problems with smaller models that require less data. This process is summarized in Figure 6-1.

Figure 6-1. Using embeddings to automate feature selection in the face of scarce labeled data
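To make the two-stage process in Figure 6-1 concrete, the following is one minimal way the pipeline might look in PyTorch. The framework, layer sizes, placeholder data, and the choice of an autoencoder as the unsupervised embedding learner are illustrative assumptions rather than the chapter's exact setup.

# Stage 1: learn a low-dimensional embedding from plentiful unlabeled data
# with an autoencoder. Stage 2: train a small classifier on the frozen
# embeddings using the scarce labeled data.
import torch
import torch.nn as nn

INPUT_DIM, EMBED_DIM, NUM_CLASSES = 784, 30, 10

# Stage 1: unsupervised embedding learner (autoencoder)
encoder = nn.Sequential(nn.Linear(INPUT_DIM, 256), nn.ReLU(),
                        nn.Linear(256, EMBED_DIM))
decoder = nn.Sequential(nn.Linear(EMBED_DIM, 256), nn.ReLU(),
                        nn.Linear(256, INPUT_DIM))
autoencoder = nn.Sequential(encoder, decoder)

recon_loss = nn.MSELoss()
ae_opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

unlabeled = torch.rand(10000, INPUT_DIM)   # plentiful unlabeled data (placeholder)
for epoch in range(5):
    ae_opt.zero_grad()
    loss = recon_loss(autoencoder(unlabeled), unlabeled)
    loss.backward()
    ae_opt.step()

# Stage 2: small supervised model trained on the frozen embeddings
labeled_x = torch.rand(500, INPUT_DIM)     # scarce labeled data (placeholder)
labeled_y = torch.randint(0, NUM_CLASSES, (500,))

classifier = nn.Linear(EMBED_DIM, NUM_CLASSES)   # far fewer parameters than a full model
clf_loss = nn.CrossEntropyLoss()
clf_opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)

with torch.no_grad():                      # embeddings come from the frozen encoder
    embeddings = encoder(labeled_x)

for epoch in range(20):
    clf_opt.zero_grad()
    loss = clf_loss(classifier(embeddings), labeled_y)
    loss.backward()
    clf_opt.step()

The key point is that the second-stage classifier sees only the 30-dimensional embeddings, so it has far fewer parameters to fit from the small labeled set than a model trained directly on the raw inputs would.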

In the process ...
