Data Mining

Data mining refers to extracting useful knowledge from what may otherwise appear to be an overwhelming amount of noisy data. Data mining is often associated with elaborate, specialized methods, but we can also mine data by scaling up the way in which we use simpler statistics. The data we’re considering here consist of 600 transactions (200 each at urban, suburban, and rural locations) for 650 products that come in multiple types of packaging. That’s 600 × 650 = 390,000 data records (which seems like a lot until you realize these are a small sample of all customer transactions). To make sense of so much data, we will convert the table of counts for each product into a chi-squared statistic. With a little more work to make the results comparable ...
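
As a rough illustration of turning one product's table of counts into a chi-squared statistic, here is a minimal Python sketch. The function name chi_squared and the counts in example_table are hypothetical, made up for this illustration and not taken from the book's data. The sketch assumes each row is a location type, each column is a packaging type, and every row and column total is positive.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a two-way table of counts.

    `table` is a list of rows; each row is a list of observed counts.
    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if location and packaging were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts for a single product (NOT from the book):
# rows are three location types with 200 transactions each,
# columns are three packaging types.
example_table = [
    [90, 60, 50],
    [70, 70, 60],
    [50, 70, 80],
]

example_stat = chi_squared(example_table)
print(round(example_stat, 2))
```

A larger statistic means the mix of packaging differs more from location to location than independence would predict. Repeating the calculation for each of the 650 products gives one number per product, which is the sense in which a simple statistic is scaled up to mine the data.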

