12 Scaling Gaussian processes to large datasets
This chapter covers
- Training a GP on a large dataset
- Using mini-batch gradient descent when training a GP
- Using an advanced gradient descent technique to train a GP faster
So far, we have seen that GPs offer great modeling flexibility. In chapter 3, we learned that we can model high-level trends using the GP’s mean function, as well as variability using its covariance function. A GP also provides calibrated uncertainty quantification; that is, predictions for data points near observations in the training dataset have lower uncertainty than those for points far away. This flexibility sets the GP apart from other ML models that produce only point estimates, such as neural networks. However, it ...
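To make the idea of calibrated uncertainty concrete, here is a minimal sketch (not the book's own code) of exact GP regression with a zero mean function and an RBF covariance, using only NumPy. The length scale, noise level, and test points are illustrative assumptions; the point is that the posterior variance shrinks near the training data and reverts toward the prior variance far from it:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    solve = np.linalg.solve(K, K_s)   # K^{-1} K_s
    mean = solve.T @ y_train
    var = np.diag(K_ss - K_s.T @ solve)
    return mean, var

x_train = np.array([-1.0, 0.0, 1.0])
y_train = np.sin(x_train)
x_test = np.array([0.1, 5.0])         # one point near the data, one far away
mean, var = gp_posterior(x_train, y_train, x_test)
print(var)  # the variance at 0.1 is far smaller than the variance at 5.0
```

Note that this exact computation solves a linear system over the full training kernel matrix, which is precisely the cubic-cost bottleneck that motivates the scaling techniques this chapter introduces.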