Appendix E
Analyzing the Regression Equation
It is a test of true theories not only to account for but to predict phenomena.
—William Whewell
■ Outliers
An outlier is an observation with a large residual—that is, a large deviation between the observed value and fitted value. Outliers reflect one of the following conditions:
- An error in collecting or manipulating the data for the given point.
- The existence of a significant extraneous causal factor that only affected the outlier(s).
- The omission of an important explanatory variable from the equation.
- A structural flaw in the model.
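The definition above can be illustrated with a minimal sketch that fits a line by ordinary least squares and flags observations whose residuals are large relative to the standard error of the regression. The data, the helper name `fit_line`, and the two-standard-error cutoff are all assumptions made for illustration. The text does not prescribe a specific threshold.

```python
# Hypothetical data: y is roughly 2x + 1, except for one aberrant point at x = 6.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 28.0, 15.1, 16.9, 19.2, 20.8]


def fit_line(xs, ys):
    """Ordinary least squares fit of ys = a + b * xs; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line(x, y)

# Residual = observed value minus fitted value.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standard error of the regression (n - 2 degrees of freedom for a line).
std_err = (sum(r ** 2 for r in residuals) / (len(x) - 2)) ** 0.5

# Assumed cutoff: flag residuals larger than two standard errors.
THRESHOLD = 2.0
flagged = [i for i, r in enumerate(residuals) if abs(r) > THRESHOLD * std_err]

print("flagged observation indexes:", flagged)
```

Note that the outlier itself inflates the standard error, so a cutoff of this kind is only a screening device. A flagged point still has to be checked against the four conditions listed above.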
The presence of outliers indicates a deficiency in the model. After verifying that an outlier is not the result of an error, one should try to identify the factors responsible for the aberrant behavior. If the outlier can be explained by a missing variable that affected all observations, then this variable should be included in the equation. If, however, the outlier was the consequence of an isolated event that is not expected to recur, then it should be viewed as an unrepresentative point, and the regression should be rerun with the outlier deleted. This recalculation is important because the method of least squares minimizes the sum of squared residuals, so points with large deviations exert a disproportionate influence on the regression coefficients. Thus, one or two such points could seriously distort the fitted regression equation. However, unless the isolated causes of the outlier have been identified, one should avoid the temptation of deleting ...
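The distortion that a single outlier can cause may be seen in a small sketch that fits the same line twice, once with all observations and once with the outlier deleted. The data and the helper name `fit_line` are hypothetical, and the example assumes the cause of the aberrant point has already been identified as an isolated event, which is the only case in which deletion is appropriate.

```python
# Hypothetical data: y is roughly 2x + 1, except for one aberrant point at x = 6.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 28.0, 15.1, 16.9, 19.2, 20.8]

# Assumption: this observation was traced to an isolated, nonrecurring event.
OUTLIER_INDEX = 5


def fit_line(xs, ys):
    """Ordinary least squares fit of ys = a + b * xs; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Fit using every observation, including the outlier.
intercept_all, slope_all = fit_line(x, y)

# Rerun the regression with the outlier deleted.
x_clean = [xi for i, xi in enumerate(x) if i != OUTLIER_INDEX]
y_clean = [yi for i, yi in enumerate(y) if i != OUTLIER_INDEX]
intercept_clean, slope_clean = fit_line(x_clean, y_clean)

print(f"all points:      y = {intercept_all:.3f} + {slope_all:.3f} x")
print(f"outlier deleted: y = {intercept_clean:.3f} + {slope_clean:.3f} x")
```

In this constructed example the underlying relationship is close to y = 1 + 2x. The fit that includes the outlier is pulled away from it, while the rerun fit lands much closer.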