Bayesian averaging

So far, we have learned that simply minimizing the loss function (or, equivalently, maximizing the log-likelihood in the case of a normal distribution) is not enough to develop a machine learning model for a given problem. One has to worry about the model overfitting the training data, which results in larger prediction errors on new datasets. The main advantage of Bayesian methods is that, in principle, one can avoid this problem without explicit regularization and without separate datasets for training and validation. This approach is called Bayesian model averaging and will be discussed here. It is one of the answers to the main question of this chapter: why Bayesian inference for machine learning?
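To make the idea concrete before the worked example, here is a minimal sketch in R (not taken from the book): instead of selecting the single best-fitting model, it averages the predictions of several candidate polynomial regression models, weighting each by an approximate posterior model probability computed from BIC weights. The simulated data, the candidate degrees, and the BIC approximation are all assumptions made purely for illustration.

```r
# Sketch of Bayesian model averaging via BIC weights (illustrative only).
set.seed(42)

# Hypothetical data: a noisy sine curve
n <- 30
x <- seq(0, 1, length.out = n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Candidate models: polynomial regressions of increasing degree
degrees <- 1:6
models <- lapply(degrees, function(d) lm(y ~ poly(x, d)))

# Approximate posterior model probabilities with BIC weights:
# p(M_k | data) is proportional to exp(-0.5 * BIC_k)
bics <- sapply(models, BIC)
weights <- exp(-0.5 * (bics - min(bics)))
weights <- weights / sum(weights)

# Model-averaged prediction on a grid of new inputs
x_new <- data.frame(x = seq(0, 1, length.out = 100))
preds <- sapply(models, function(m) predict(m, newdata = x_new))
y_bma <- preds %*% weights   # weighted average of per-model predictions

round(weights, 3)  # weights concentrate on plausible degrees
```

Note that no separate validation set is used: overly complex polynomials are penalized through their lower (approximate) posterior probability, so the averaged prediction is not dominated by an overfitted model.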

For this, let's do ...
