Chapter 2: Preparing Your Data: Introduction

Introduction

Explore the Data

Model Studio: Data Preprocessing

Demo 2.1: Exploring Source Data

Divide the Data

Honest Assessment

Address Rare Events

Model Studio

Demo 2.2: Modifying the Data Partition

Data Preparation Best Practices

Model Studio: Feature Engineering Template

Demo 2.3: Running the Feature Engineering Pipeline Template

Quiz

Introduction

Trash in, trash out! To be effective, machine learning models need to be built from well-prepared data. It is often said that 80% of the time spent building a successful machine learning application goes to data preparation (Dasu and Johnson 2003). Data preparation is not strictly about correctly transforming and cleaning existing data. It also ...
