July 2017
Simple linear regression uses a least squares approach: it computes the line that minimizes the sum of the squared vertical distances between the data points and the line. Sometimes the line is calculated without the Y-intercept term, which forces it through the origin. The regression line is an estimate, and we can use its equation to predict other data points. This is useful when we want to predict future events based on past performance.
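The least squares criterion can be written out explicitly. The notation here is an assumption introduced for illustration and does not come from the original text: the observations are the pairs (x_i, y_i), a is the Y intercept, b is the slope, and the bars denote sample means. The last line covers the case where the intercept term is left out.

```latex
% Fitted line and the quantity that least squares minimizes
\hat{y} = a + b x, \qquad
\min_{a,\,b} \; \sum_{i=1}^{n} \bigl( y_i - (a + b x_i) \bigr)^2

% Closed-form solution when the intercept is included
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}

% Slope when the line is fitted without the intercept term (a = 0)
b = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}
```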
In the following example, we use the Apache Commons Math SimpleRegression class with the Belgium population dataset used in Chapter 4, Data Visualization. The data is duplicated here for your convenience:
| Decade | Population |
| --- | --- |
| 1950 | 8639369 |
| 1960 | 9118700 |
| 1970 | 9637800 |
| 1980 | 9846800 |
| 1990 | 9969310 |
| 2000 | 10263618 ... |
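To make the least squares calculation concrete, here is a minimal plain-Java sketch that fits a line to the rows shown above. It is not the SimpleRegression class. The names LeastSquaresSketch, fit, and predict are hypothetical and exist only for this illustration. It uses only the six rows visible in the table.

```java
// Minimal sketch: ordinary least squares for a single predictor, computed by hand.
// LeastSquaresSketch, fit, and predict are illustrative names, not part of
// Apache Commons Math.
final class LeastSquaresSketch {

    /** Returns {intercept, slope} of the least squares line y = intercept + slope * x. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    /** Evaluates the fitted line at x. */
    static double predict(double[] line, double x) {
        return line[0] + line[1] * x;
    }

    public static void main(String[] args) {
        // Only the six rows visible in the table above.
        double[] decades = {1950, 1960, 1970, 1980, 1990, 2000};
        double[] population = {8639369, 9118700, 9637800, 9846800, 9969310, 10263618};

        double[] line = fit(decades, population);
        System.out.printf("slope = %.3f, intercept = %.3f%n", line[1], line[0]);
        System.out.printf("predicted population for 2010 = %.0f%n", predict(line, 2010));
    }
}
```

The slope is the estimated population change per year, so multiplying it by 10 gives the change per decade. With Apache Commons Math on the classpath, SimpleRegression offers the same quantities through addData, getSlope, getIntercept, and predict, so a hand-rolled version like this is mainly useful for checking your understanding of the results.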