8.12 Regression Modeling

Regression is a method for building mathematical models of how a response variable is affected by changes in one or more predictor variables. The first step in regression is to choose the form of the model that we want to use. As a general rule, we strive to use the simplest form that is appropriate. This rule is of general importance in research and science and, as explained in Chapter 5, it is called the principle of parsimony. Scientists should prefer the simplest explanation that accounts for a given phenomenon and not add unnecessary assumptions or details. Complex explanations often indicate that there is a problem with the basic idea. The tenor of the principle is well captured in a phrase that is often credited to Albert Einstein: “Simplify as much as possible – but not more!” We will return to this discussion at the very end of this section.

Figure 8.12 A scatter plot with a linear regression line and its regression equation, as obtained in Excel.

The simplest model that we can use is the straight line, so let us start with that. Figure 8.12 shows a scatter plot where each black diamond represents one observation, with its x value on the horizontal axis and its y value on the vertical axis. We say that the y sample is plotted versus the x sample. Here y appears to have a reasonably linear relationship to x, so fitting a straight line through the data is appropriate in this case. ...
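To make the idea of fitting a straight line concrete, the following is a minimal sketch in Python. It assumes the line is fitted by ordinary least squares, which is the method behind Excel's linear trendline. The function name fit_line and the data values are hypothetical and chosen only for illustration; they are not the data plotted in Figure 8.12.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least two points")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations in x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    # The least-squares line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Made-up illustrative data (an assumption, not the values in Figure 8.12).
x_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
y_sample = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x_sample, y_sample)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

The printed equation plays the same role as the regression equation shown in the figure: the slope estimates how much y changes per unit change in x, and the intercept is the fitted value of y at x = 0.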

