9 Linear Regression

9.1 Basics and Simple Linear Regression

Simple linear regression assumes a systematic relationship between two continuous variables, in which the value of one variable (the response) depends on the value of the other (the predictor or explanatory variable). This is the key difference from the correlation analysis treated in Chapter 8, where the two continuous variables are interchangeable. As in correlation analysis, the relationship can be positive (the larger the predictor value, the larger the response value) or negative (the larger the predictor value, the smaller the response value). The linear relationship is determined by two parameters:

  • the intercept (β₀) – the value of the response variable (y) when the explanatory variable (x) is zero.
  • the slope (β₁) – the rate of change in y associated with a one‐unit change in x.
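
As a minimal sketch of these two definitions, the Python snippet below evaluates a straight line without any error term. The parameter values and the names `beta0`, `beta1` and `predict` are hypothetical and chosen only for illustration.

```python
# Hypothetical parameter values, chosen for illustration only.
beta0 = 2.0   # intercept: value of y when x is zero
beta1 = 0.5   # slope: change in y per one-unit change in x


def predict(x):
    """Return the point on the line y = beta0 + beta1 * x (no error term)."""
    return beta0 + beta1 * x


y_at_zero = predict(0.0)                    # recovers the intercept
unit_change = predict(4.0) - predict(3.0)   # recovers the slope

print(y_at_zero, unit_change)  # prints: 2.0 0.5
```

Evaluating the line at x = 0 returns the intercept, and the difference between two predictions one unit apart returns the slope, whichever pair of x values is used.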

Together with the error term ε (also known as residual error or random error), representing real‐world variation, the intercept and slope form the basic linear regression model:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$$

where y is the response variable and x is the explanatory variable. The subscript i is the running index and refers to an individual observation (that is, i runs from the first to the last observation, i = 1, …, n). The ε ~ N(0, σ²) part describes the assumption that the model errors (estimated by the residuals) ...
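
The model can be made concrete by simulating data from it and then recovering the parameters. The Python sketch below is an illustration under stated assumptions, not the book's own code: the "true" values `BETA0`, `BETA1` and `SIGMA`, the uniform range for x, and the helper names `simulate` and `fit_line` are all hypothetical. The fit uses the closed-form ordinary least-squares estimates, a standard estimation method that this excerpt has not yet introduced.

```python
import random

# Hypothetical "true" parameters, chosen only for this illustration.
BETA0, BETA1, SIGMA = 1.0, 2.0, 0.5


def simulate(n, seed=42):
    """Draw n observations from y_i = BETA0 + BETA1 * x_i + eps_i,
    with eps_i ~ N(0, SIGMA^2) and x_i uniform on [0, 10] (an assumption)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [BETA0 + BETA1 * x + rng.gauss(0.0, SIGMA) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Return closed-form least-squares estimates (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


xs, ys = simulate(500)
b0_hat, b1_hat = fit_line(xs, ys)

# Residuals are the observed counterparts of the unobservable errors eps_i.
residuals = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)

print(f"intercept: {b0_hat:.2f}, slope: {b1_hat:.2f}, "
      f"mean residual: {mean_residual:.1e}")
```

With enough observations the estimates should land close to the values used in the simulation, and the residuals of a least-squares fit with an intercept average to zero up to floating-point error, which matches the zero mean assumed for ε.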
