3. Data Preparation

Overview

In this chapter, we focus on the data preparation that must be completed before an AI project can begin model training and evaluation. You will practice ETL (Extract, Transform, and Load) or ELT (Extract, Load, and Transform), data cleaning, and the other data preparation tasks that data engineers commonly perform. We will cover batch jobs, streaming data ingestion, and feature engineering. By the end of this chapter, you will have working knowledge of, and some hands-on experience with, data preparation techniques.
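To make the ETL acronym concrete, here is a minimal sketch of the three steps using only the Python standard library. It is an illustration under stated assumptions, not the chapter's own exercise code: the sample data, the `people` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file or an API.
RAW_CSV = """id,name,age
1, Alice ,34
2,Bob,
3,Carol,29
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, drop rows without an age, cast types."""
    cleaned = []
    for row in rows:
        age = row["age"].strip()
        if not age:
            continue  # data cleaning: skip incomplete records
        cleaned.append((int(row["id"]), row["name"].strip(), int(age)))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people (id INTEGER, name TEXT, age INTEGER)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text=RAW_CSV):
    """Run the three steps in ETL order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn
```

In an ELT variant, the raw rows would be loaded into the target store first and the cleaning step would run there afterwards, for example as SQL statements against a staging table.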

Introduction

In the previous chapter, we discussed the layers of a data-driven system and explained the important storage requirements for each layer. The storage containers in the data layers of AI solutions ...
