Chapter 7. Probabilistic Machine Learning with Generative Ensembles

Don’t look for the needle in the haystack. Just buy the haystack!

— John Bogle, inventor of the index fund and founder of the Vanguard Group

Most of us probably didn’t know we were learning one of the most powerful and robust ML algorithms in high school when we were finding the line of best fit to a scatter of data points. The ordinary least squares (OLS) algorithm that is used to estimate the parameters of linear regression models was developed by Adrien-Marie Legendre and Carl Gauss more than two hundred years ago. These types of models have the longest history and are viewed as the baseline machine learning models in general. Linear regression and classification models are considered to be the most basic artificial neural networks. It is for these reasons that linear models are considered to be the “mother of all parametric models.”

Linear regression models play a pivotal role in modern financial practice, academia, and research. The two foundational models of financial theory are linear regression models: the capital asset pricing model (CAPM) is a simple linear regression model; and the model of arbitrage pricing theory (APT) is a multiple regression model. Factor models used extensively by investment managers are just multiple regression models with public and proprietary factors. A factor is a financial feature such as the inflation rate. Linear models are also the model of choice for many high-frequency ...

Get Probabilistic Machine Learning for Finance and Investing now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.