When we compute a sample average, each observation in the sample has the same weight in determining the outcome. In the regression situation, this is not the case. For example, we noted in Section 2.9 that the location of observations in x space can play an important role in determining the regression coefficients (refer to Figures 2.8 and 2.9). We have also focused attention on outliers, or observations that have unusual y values. In Section 4.4 we observed that outliers are often identified by unusually large residuals and that these observations can also affect the regression results. The material in this chapter is an extension and consolidation of some of these issues.
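The contrast between the sample average and regression can be made concrete with a small sketch. In simple linear regression the least-squares slope is a linear combination of the y values, with weights c_i = (x_i - x̄) / S_xx that grow with the distance of x_i from the mean of x. The data and the helper name `slope_weights` below are hypothetical and chosen only for illustration.

```python
import numpy as np

def slope_weights(x):
    """Weights c_i such that the least-squares slope equals sum(c_i * y_i)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()              # distance of each x from the mean of x
    return d / np.sum(d ** 2)     # c_i = (x_i - x_bar) / S_xx

# Hypothetical data: four clustered x values and one remote one.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 10.1])

mean_weights = np.full(len(x), 1.0 / len(x))  # sample average: equal weights
c = slope_weights(x)                          # regression slope: unequal weights
slope = float(np.sum(c * y))

print("weights for the sample average:", mean_weights)
print("weights for the slope:         ", c)
print("slope from the weighted sum:   ", slope)
```

In this sketch the observation with the most remote x value carries the largest weight in absolute value, while an observation sitting at the mean of x carries no weight in the slope at all.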

Consider the situation illustrated in Figure 6.1. The point labeled A in this figure is remote in x space from the rest of the sample, but it lies almost on the regression line passing through the rest of the sample points. This is an example of a leverage point; that is, it has an unusual x value and may control certain model properties. Now this point does not affect the estimates of the regression coefficients, but it certainly will have a dramatic effect on the model summary statistics such as R² and the standard errors of the regression coefficients. Now consider the point labeled A in Figure 6.2. This point has a moderately unusual x coordinate, and the y value is unusual as well. This is an influence ...
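The leverage-point behavior described for Figure 6.1 can be reproduced numerically. The following is a minimal sketch under stated assumptions: the data are made up rather than taken from the figure, the helper `fit_summary` is hypothetical, and point A is placed exactly on the line fitted to the other points (the text says only "almost"). Leverage is measured here by the diagonal of the hat matrix H = X(X'X)⁻¹X', a standard diagnostic that this excerpt has not yet introduced.

```python
import numpy as np

def fit_summary(x, y):
    """Least-squares fit of y = b0 + b1*x.

    Returns the coefficients, the standard error of the slope,
    R^2, and the diagonal of the hat matrix.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])    # design matrix with intercept
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    n, p = X.shape
    mse = resid @ resid / (n - p)                # residual mean square
    se_slope = np.sqrt(mse * XtX_inv[1, 1])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    hat_diag = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # h_ii = x_i' (X'X)^-1 x_i
    return beta, se_slope, r2, hat_diag

# Hypothetical data (not from the text): eight clustered points with fixed noise.
x = np.arange(1.0, 9.0)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.35, -0.15, 0.2, -0.2])
y = 2.0 + 0.5 * x + noise

beta_base, se_base, r2_base, _ = fit_summary(x, y)

# Point A: remote in x, placed exactly on the line fitted to the other points.
x_a = 20.0
y_a = beta_base[0] + beta_base[1] * x_a
beta_with, se_with, r2_with, hat_with = fit_summary(np.append(x, x_a), np.append(y, y_a))

print("coefficients unchanged:", np.allclose(beta_with, beta_base))
print("slope SE:", se_base, "->", se_with)
print("R^2:", r2_base, "->", r2_with)
print("largest hat diagonal at index:", int(np.argmax(hat_with)))
```

Because A has a zero residual on the original line, the normal equations are still satisfied by the original coefficients, so the fit does not move. The residual sum of squares is also unchanged, while the spread in x and the total sum of squares both grow, which is why the slope's standard error shrinks and R² rises.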
