APPENDIX D
More on Regression
“The plot thickens.”
—Anonymous
In Chapter 7 we discussed the use of regression analysis in building empirical models. Sometimes, as we use regression analysis, we encounter issues and needs beyond those discussed in Chapter 7. In this appendix we discuss five commonly encountered situations that can benefit from further elaboration. These issues and problems are:
- My regression model has a low adjusted R-squared value, say below 50%. What should I conclude? Is my model poor, could there be problems with the data, or is something else wrong?
- I encounter some observations that have large standardized residuals (e.g., greater than 2.0 in absolute value). What should I do? Should I delete these “atypical” data points and refit the model, or is there something else I should do?
- In my data set there are both continuous variables and discontinuous (discrete) variables. I have variables like store size or month of year (quantitative) and variables like region of the country, such as East and West (qualitative). Can I use regression analysis to build models with this kind of data? What approach should I use?
- Sometimes you are building models for processes that have more than one response of importance. A classic example is process speed (as measured by cycle time) and the quality of the product produced by the process. The obvious answer is to find the combinations of predictor variables (x's) that will maximize both process speed and product quality, or at least produce quality better ...
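The first two situations in the list refer to two diagnostics, adjusted R-squared and standardized residuals. As a rough illustration of how these quantities are computed, here is a minimal Python sketch for a one-predictor model. The function name `fit_simple_regression` and the data are hypothetical, made up for this example. The sketch assumes the leverage-adjusted definition of a standardized residual, e_i / (s * sqrt(1 - h_ii)); some software divides by s alone. The 2.0 cutoff is the rule of thumb quoted in the list, not a formal test.

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x with two basic diagnostics.

    Hypothetical helper written for illustration only.
    """
    n = len(x)
    p = 1  # number of predictors in this one-variable model
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least-squares slope and intercept
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)        # error sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)    # total sum of squares

    r2 = 1 - sse / sst
    # Adjusted R-squared compares mean squares instead of sums of squares
    adj_r2 = 1 - (sse / (n - p - 1)) / (sst / (n - 1))

    # Residual standard deviation and leverage of each observation
    s = math.sqrt(sse / (n - p - 1))
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Assumed definition: residual scaled by its estimated standard deviation
    std_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]
    return r2, adj_r2, std_resid


# Made-up data: roughly y = 2x, with one atypical point at x = 6
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 18.0, 14.2, 15.9, 18.1, 20.0]

r2, adj_r2, std_resid = fit_simple_regression(x, y)
flagged = [i for i, r in enumerate(std_resid) if abs(r) > 2.0]

print(f"R-squared: {r2:.3f}, adjusted R-squared: {adj_r2:.3f}")
print("Observations with |standardized residual| > 2.0:", flagged)
```

The sketch only flags observations. Whether a flagged point should be deleted, kept, or investigated is the question the list raises, and the code does not answer it.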