Chapter 11. General Linear Models and Least Squares

The universe is a really big and really complicated place. All animals on Earth have a natural curiosity to explore and try to understand their environment, but we humans are privileged with the intelligence to develop scientific and statistical tools to take our curiosity to the next level. That’s why we have airplanes, MRI machines, rovers on Mars, vaccines, and, of course, books like this one.

How do we understand the universe? By developing mathematically grounded theories, and by collecting data to test and improve those theories. And this brings us to statistical models. A statistical model is a simplified mathematical representation of some aspect of the world. Some statistical models are simple (e.g., predicting that the stock market will increase over decades); others are much more sophisticated, like the Blue Brain Project that simulates brain activity with such exquisite detail that one second of simulated activity requires 40 minutes of computation time.

A key distinction of statistical models (as opposed to other mathematical models) is that they contain free parameters that are fit to data. For example, I know that the stock market will go up over time, but I don’t know by how much. Therefore, I allow the change in stock market price over time (that is, the slope) to be a free parameter whose numerical value is determined by data.
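To make the idea of a free parameter concrete, here is a minimal sketch of fitting a slope to data with least squares. The data are synthetic and invented purely for illustration (the names `years` and `index`, the "true" slope of 5, and the noise level are all assumptions, not real market figures), and `np.linalg.lstsq` is just one reasonable way to compute the fit.

```python
import numpy as np

# Synthetic, made-up data for illustration only (not real market figures).
rng = np.random.default_rng(0)
years = np.arange(10, dtype=float)                  # time axis: 0, 1, ..., 9
true_intercept, true_slope = 100.0, 5.0             # assumed values used to generate the data
index = true_intercept + true_slope * years + rng.normal(0.0, 1.0, size=years.size)

# Design matrix: a column of ones (intercept) and a column of time (slope).
X = np.column_stack([np.ones_like(years), years])

# Least-squares fit: the free parameters are whatever values best match the data.
beta, *_ = np.linalg.lstsq(X, index, rcond=None)
intercept, slope = beta

print(f"estimated intercept: {intercept:.2f}")
print(f"estimated slope:     {slope:.2f}")
```

The model only asserts that the index changes linearly over time; the numerical value of the slope is not chosen in advance but estimated from the data.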

Crafting a statistical model can be difficult and requires creativity, experience, ...
