Chapter 1. Data Understanding

In this chapter, we will cover:

Using an empty aggregate to evaluate sample size
Evaluating the need to sample from the initial data
Using CHAID stumps when interviewing an SME
Using a single cluster K-means as an alternative to anomaly detection
Using an @NULL multiple Derive to explore missing data
Creating an Outliers report to give to SMEs
Detecting potential model instability early using the Partition node and Feature Selection node

Introduction

This opening chapter covers data understanding, but data understanding is not the first phase of CRISP-DM. Business understanding comes first, and it is a critical phase. Some, including the authors of this book, would argue that business understanding is the phase most in need of more attention by ...

IBM SPSS Modeler Cookbook by
