O'Reilly logo

Practical Predictive Analytics by Ralph Winters

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Outliers in the regression

Note that preceding each plot also identifies extreme data points in the dataset which may have an influence upon the regression line (line drawn through the data). The extreme values are represented in bold within the plot itself.

For example,in all of the preceding plots, you can see that observation 26 is considered an outlier with respect to the other covariates.

Print this observation to the console along with some of its neighbors. You can see how this could be considered an outlier just by inspection, since two variables (Duration and Treatment) immediately pop out as being different from the surrounding observations:

df[c(26,27,28,25),]

This is the output which is sent to the console:

 Treatment Gender ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required