Appendix D

Sources and Citations

Data Sources

Chapter 2. A Day in Your Life as a Data Miner

Property ownership data

The property ownership datasets used in this chapter were provided by Matt Schumwinger of Big Lake Data (see his profile in Chapter 2) for this case study. These datasets are not publicly available.

Similar data is available for download from the City of Milwaukee, Wisconsin, at

Chapter 12. Getting Familiar with Your Data

Property ownership data: see Chapter 2.

Cigarette data

This data was collected for research described in “Determination of tar, nicotine, and carbon monoxide yields in the mainstream smoke of selected international cigarettes,” Tobacco Control 2004;13:45-51 doi:10.1136/tc.2003.003673, A M Calafat, G M Polzin, J Saylor, P Richter, D L Ashley, C H Watson. (Full text here:

The data is available through the journal Tobacco Control at

Chapter 13. Dealing in Graphic Detail

Property ownership data: see Chapter 2.

Auto mileage data

This data was collected for research described in Quinlan, R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.

This data was obtained from the UCI Machine Learning Repository.

Bache, K. & Lichman, ...
