Stages of KDD

KDD can be split into stages. Seven of them are named and briefly described below. Keep in mind that a well-conducted KDD process may need to loop back through these stages from time to time:

  1. Understanding: First, you have to understand your problem. Gather prior knowledge, and learn the challenges, the limitations, and how the problem is usually dealt with (if at all). It is also advisable to look for inspiration in different places, and it is important to set goals from the customer's viewpoint.
  2. Data selection: In this stage, you look for data. Gather samples for training (discovery), testing, and validation. How to sample and how large each sample should be are key decisions at this step. A well-designed and well-conducted sampling process can be the difference between meaningful ...
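
As a rough illustration of the data selection stage, the sketch below shuffles a collection of records and splits it into training, test, and validation samples. It is only a minimal example under stated assumptions: the function name `split_samples`, the 60/20/20 proportions, and the fixed seed are hypothetical choices made for illustration and do not come from the text. Simple random sampling is also just one option; other sampling designs may suit a given problem better.

```python
import random


def split_samples(records, train_frac=0.6, test_frac=0.2, seed=42):
    """Shuffle records and split them into training, test, and validation samples.

    The fractions and the seed are illustrative assumptions, not recommendations.
    Whatever is left after the training and test samples becomes the validation sample.
    """
    if train_frac <= 0 or test_frac <= 0 or train_frac + test_frac >= 1:
        raise ValueError("fractions must be positive and leave room for validation")

    shuffled = list(records)               # copy, so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split reproducible

    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)

    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


# Hypothetical data: 100 record identifiers standing in for real observations.
records = list(range(100))
train, test, validation = split_samples(records)
print(len(train), len(test), len(validation))
```

Fixing the seed makes the split repeatable, which helps when stages of the process are revisited and results need to be compared against the same samples.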
