Python Machine Learning - Third Edition

by Sebastian Raschka, Vahid Mirjalili
December 2019
Beginner to intermediate
772 pages
19h 20m
English
Packt Publishing

4

Building Good Training Datasets – Data Preprocessing

The quality of the data and the amount of useful information it contains are key factors that determine how well a machine learning algorithm can learn. It is therefore critical to examine and preprocess a dataset before feeding it to a learning algorithm. In this chapter, we will discuss the essential data preprocessing techniques that help us build good machine learning models.

The topics that we will cover in this chapter are as follows:

  • Removing and imputing missing values from the dataset
  • Getting categorical data into shape for machine learning algorithms
  • Selecting relevant features for model construction
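
To make the first two topics concrete, here is a minimal sketch using pandas. It is not taken from the chapter: the toy DataFrame, its column names (size_cm, color, label), and the choice of mean imputation and one-hot encoding are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a numeric column with one gap and a nominal column.
df = pd.DataFrame({
    "size_cm": [10.0, np.nan, 30.0, 20.0],
    "color": ["red", "green", "red", "blue"],
    "label": [0, 1, 0, 1],
})

# Option 1: remove every row that contains a missing value.
dropped = df.dropna(axis=0)

# Option 2: keep all rows and fill the gap with the column mean instead.
imputed = df.copy()
imputed["size_cm"] = imputed["size_cm"].fillna(imputed["size_cm"].mean())

# One-hot encode the nominal feature so that it becomes numeric columns.
encoded = pd.get_dummies(imputed, columns=["color"])

print(dropped)
print(encoded)
```

Dropping rows is simple but discards data, whereas imputation keeps every sample at the cost of introducing an estimated value; which trade-off is acceptable depends on the dataset.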

Dealing with missing data

It is ...

Publisher Resources

ISBN: 9781789955750