Practical Predictive Analytics by Ralph Winters

P-values and effect sizes

P-values have long been a concern with small samples. Large samples can mislead as well: the p-value may be correct and highly significant, yet the magnitude of the change (the effect) may be very small.

Take this example, in which you measure the effect of winning an average of $1,000,000 versus $1,000,001 in a lottery.

We will generate two probability samples:

  • X contains 1 million observations with a mean value of 1,000,000
  • Y also contains 1 million observations, but with a mean value of 1,000,001, a difference of only 1 unit

Generate a dataframe with X and Y and print some summary statistics:

 set.seed(1020)
 lottery <- data.frame( cbind(x=rnorm(n=1000000,1000000,100),y=rnorm(1000000,1000001,100) ...
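The listing above is cut off, so the following is only a minimal sketch of how the comparison might be completed, not the author's code. It keeps the seed, sample sizes, means, and standard deviation from the listing. The choice of a Welch two-sample t-test and of Cohen's d as the effect size are assumptions, as are the names `n`, `tt`, `p_value`, `pooled_sd`, and `cohens_d`.

```r
# Sketch only: everything after the two rnorm() calls is an assumption,
# because the original listing is truncated.
set.seed(1020)
n <- 1000000

# Two samples with the same spread and means that differ by 1 unit
lottery <- data.frame(
  x = rnorm(n, mean = 1000000, sd = 100),
  y = rnorm(n, mean = 1000001, sd = 100)
)
summary(lottery)

# Welch two-sample t-test (assumed choice of significance test)
tt <- t.test(lottery$y, lottery$x)
p_value <- tt$p.value

# Cohen's d (assumed choice of effect size): the mean difference divided
# by the pooled standard deviation, using the equal-sample-size form
pooled_sd <- sqrt((var(lottery$x) + var(lottery$y)) / 2)
cohens_d <- (mean(lottery$y) - mean(lottery$x)) / pooled_sd

print(p_value)
print(cohens_d)
```

With a true difference of 1 and a standard deviation of 100, the population standardized difference is 1/100 = 0.01, so the test should be able to detect the difference at this sample size even though the effect itself is negligible in practical terms.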
