© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
A. Ye, Z. Wang, Modern Deep Learning for Tabular Data, https://doi.org/10.1007/978-1-4842-8692-0_2

2. Data Preparation and Engineering

Andre Ye (1) and Zian Wang (2)
(1) Seattle, WA, USA
(2) Redmond, WA, USA

The goal is to turn data into information, and information into insight.

—Carly Fiorina, Former CEO of Hewlett-Packard

We define data preparation, or data preprocessing, as a transformation or set of transformations applied to raw data, collected directly from the data source, so that the data better represents the information it contains. The purpose is to make the data easier to model (Figure 2-1).

A flow diagram is depicted as follows, raw data leads to preprocessed data which further ...
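
To make the definition concrete, the following is a minimal sketch of one possible set of transformations, not the authors' own pipeline. The records, the column names `age` and `city`, and the helper `preprocess` are all hypothetical, and the chosen steps (mean imputation, standardization, one-hot encoding) are just common examples of preprocessing.

```python
from statistics import mean, pstdev

# Hypothetical raw records, as they might arrive from a data source:
# numbers stored as strings, one missing value, one categorical column.
raw_rows = [
    {"age": "34", "city": "Seattle"},
    {"age": "", "city": "Redmond"},
    {"age": "46", "city": "Seattle"},
]


def preprocess(rows):
    """Turn raw records into numeric feature vectors.

    Each output row is [standardized age, one-hot city columns...].
    """
    # Parse numeric strings; mark empty strings as missing.
    ages = [float(r["age"]) if r["age"] else None for r in rows]

    # Impute missing ages with the mean of the observed ones.
    fill = mean(a for a in ages if a is not None)
    ages = [fill if a is None else a for a in ages]

    # Standardize to zero mean and unit (population) standard deviation.
    mu, sigma = mean(ages), pstdev(ages)

    # One-hot encode the city, using a fixed, sorted category order.
    cities = sorted({r["city"] for r in rows})

    features = []
    for age, row in zip(ages, rows):
        scaled = (age - mu) / sigma if sigma else 0.0
        one_hot = [1.0 if row["city"] == c else 0.0 for c in cities]
        features.append([scaled] + one_hot)
    return features


features = preprocess(raw_rows)
for vector in features:
    print(vector)
```

Here the raw strings carry the same information as the output vectors, but only the latter are in a form a numerical model can consume directly, which is the sense of "better represent information" in the definition.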
