Bayesian averaging

So far, we have learned that simply minimizing the loss function (or, equivalently, maximizing the log-likelihood function when the errors are normally distributed) is not enough to develop a machine learning model for a given problem. One also has to worry about the model overfitting the training data, which results in larger prediction errors on new datasets. The main advantage of Bayesian methods is that one can, in principle, avoid this problem without using explicit regularization or separate datasets for training and validation. This approach is called Bayesian model averaging and is discussed here. It is one of the answers to the main question of this chapter: why use Bayesian inference for machine learning?

For this, let's do ...
