10 Supervised Learning

In this chapter, we will discuss supervised learning. What distinguishes supervised learning problems from unsupervised learning problems is that the data come in pairs, i.e. we may say $(x_k, y_k) \in \mathbb{R} \times \mathbb{R}$ for $k \in \mathbb{N}_N$, and we would like to find a relationship between the pairs of data. We will start with linear regression. This does not mean that the data pairs are related to one another in a linear way; rather, it is the class of functions that we consider that is parameterized in a linear way. First, we will do this in a finite-dimensional space, where we will also discuss statistical interpretations and generalizations such as maximum likelihood estimation, maximum a posteriori estimation, and regularization. We will then carry out regression in an infinite-dimensional space, i.e. in a Hilbert space, and we will see that this is equivalent to maximum a posteriori estimation for so-called Gaussian processes. Then we will discuss classification using linear regression, logistic regression, support vector machines, and the restricted Boltzmann machine. The chapter finishes with artificial neural networks and the so-called back-propagation algorithm. We also discuss a form of implicit regularization known as dropout.
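To make the linear-in-parameters point concrete, here is a minimal sketch (not from the book) that fits a quadratic model by ordinary least squares in NumPy. The model is nonlinear in $x$, yet linear in the unknown coefficients $w$, which is the only sense in which linear regression is "linear"; the data, feature map, and coefficient values below are illustrative assumptions.

```python
import numpy as np

# Synthetic data: y is a quadratic function of x plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 - 2.0 * x + 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)

# Design matrix with columns [1, x, x^2]: the model Phi @ w is
# nonlinear in x but linear in the parameter vector w.
Phi = np.column_stack([np.ones_like(x), x, x**2])

# Least-squares estimate of w.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(w)  # close to the true coefficients [1, -2, 3]
```

The same least-squares machinery applies unchanged to any fixed set of basis functions, since only the parameterization needs to be linear.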

10.1 Linear Regression

We start by considering the problem ...
