9 Linear Regression

Art, like morality, consists of drawing the line somewhere.

Gilbert Keith Chesterton

9.1. Introduction

In the previous section, we discussed the existence of relationships or “correlations” between variables, and the task of discovering these correlations in order to measure the strength of the relationships among several variables. But in most cases, knowing that a relationship exists is not enough; to better analyze this relationship, we have to use “linear regression”.

Linear regression, which we will look at in this chapter, can be used to model the relationships between different variables. Broadly, the method aims to explain the influence of a set of variables on the values of another variable. That variable is called the “dependent variable” because it depends on the other variables (the “independent variables”).
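To make the dependent/independent distinction concrete, here is a minimal sketch in Python using NumPy. The data are invented for illustration (they are not from this book) and are exactly linear, so the fit is easy to check by eye; the variable names `size` and `price` are assumptions chosen only for this example.

```python
import numpy as np

# Hypothetical toy data (made up for this sketch):
# size  = independent variable (house size in square meters)
# price = dependent variable (price in thousands), here exactly 3 * size
size = np.array([50.0, 60.0, 80.0, 100.0, 120.0])
price = np.array([150.0, 180.0, 240.0, 300.0, 360.0])

# Design matrix: a column of ones for the intercept, then the independent variable.
X = np.column_stack([np.ones_like(size), size])

# Ordinary least squares: find the coefficients minimizing ||X @ coef - price||^2.
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
intercept, slope = coef

# Use the fitted line to estimate the dependent variable for a new input.
predicted = intercept + slope * 90.0

print(f"intercept={intercept:.3f}, slope={slope:.3f}, predicted={predicted:.3f}")
```

Because the toy data lie exactly on a line, the fitted slope comes out close to 3 and the intercept close to 0; real data would be noisy, and the fitted line would only approximate the observations.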

Regression is an explanatory method that makes it possible to identify the input variables that have the greatest statistical influence on the result. It is a basic algorithm that any Machine Learning enthusiast should master, since it provides a reliable foundation for learning and mastering other data analysis algorithms.

Linear regression is one of the most important and widely used data analysis algorithms. Its applications range from forecasting the weather or the prices of homes and cars to understanding gene regulation networks and determining tumor types.

In this chapter, you will learn about this ...
