Chapter 2. Data Preparation – Select

In this chapter, we will cover:

Using the Feature Selection node creatively to remove or decapitate perfect predictors
Running a Statistics node on an anti-join to evaluate the potential missing data
Evaluating the use of sampling for speed
Removing redundant variables using correlation matrices
Selecting variables using the CHAID Modeling node
Selecting variables using the Means node
Selecting variables using single-antecedent Association Rules
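As a rough illustration of the idea behind the correlation-matrix recipe listed above, the following sketch works outside Modeler, in plain Python. It is not the book's procedure. The function names, the 0.95 threshold, and the toy data are all assumptions made for this example. It keeps variables in order and drops any variable whose absolute Pearson correlation with an already kept variable exceeds the threshold.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (a constant column would
    give zero variance and a division by zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_redundant(columns, threshold=0.95):
    """Return the names of columns to keep, in their original order.

    A column is skipped when its |r| with any already kept column
    is above `threshold` (a hypothetical cutoff, not a recommendation).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: height_in is height_cm converted to inches,
# so the two columns carry nearly the same information.
data = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34.0, 51.0, 29.0, 62.0, 45.0],
}
kept = drop_redundant(data)
print(kept)
```

Which member of a correlated pair survives depends only on column order here. In practice that choice is usually guided by data quality or by relevance to the data mining goals.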

Introduction

This chapter focuses on just the first task, Select, of the data preparation phase:

Decide on the data to be used for analysis. Criteria include relevance to the data mining goals, quality, and technical constraints such as limits on data volume or data types. Note ...

IBM SPSS Modeler Cookbook by
