6.1. What Is Data Mining?

Data mining is the practice of analyzing data sets, while connected to the database server or file where the original data resides, to discover anomalies and patterns. Its value is clearest with large data sets of thousands or millions of rows, where no person could scan all the data to find problems or trends.
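To make the idea of discovering anomalies concrete, here is a minimal sketch in Python. It is only an illustration, not the method used by any particular tool: the function name `find_anomalies`, the synthetic `amounts` column, and the threshold of six standard deviations are all assumptions chosen for the example.

```python
import random
import statistics


def find_anomalies(values, threshold=6.0):
    """Return (row_index, value) pairs that lie more than `threshold`
    standard deviations away from the mean of `values`."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # every value is identical, so nothing stands out
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mean) / stdev > threshold
    ]


# Hypothetical data: 100,000 rows of ordinary amounts clustered near 100 ...
random.seed(0)
amounts = [random.gauss(100.0, 5.0) for _ in range(100_000)]

# ... with two bad rows planted where nobody would find them by scrolling.
amounts[12_345] = 900.0
amounts[67_890] = -400.0

anomalies = find_anomalies(amounts)
for row, value in anomalies:
    print(f"row {row}: {value}")
```

A simple distance-from-the-mean rule like this only catches extreme values in a single column. It is a stand-in for the idea of letting software scan every row, not a substitute for the richer pattern-finding techniques that data mining tools provide.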

We've all seen massive Excel spreadsheets that seem to scroll through rows and columns forever. Turning that data into PivotTables and charts provides the first real visualization of what it represents. But what if the data has many different characteristics, not all of which can be represented in graphical form?

The answer is data mining, which ...
