Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits

by Tarek Amr
July 2020
Intermediate to advanced
384 pages
8h 38m
English
Packt Publishing
Preparing Your Data

In the previous chapter, we worked with clean data: every value was present, every column was numeric, and when we had too many features, a regularization technique kept them in check. Real-world data is rarely that clean. Even clean data can often be preprocessed to make it easier for a machine learning algorithm to learn from. In this chapter, we will learn about the following data preprocessing techniques:

  • Imputing missing values
  • Encoding non-numerical columns
  • Changing the data distribution
  • Reducing the number of features via selection
  • Projecting data into new dimensions
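
To make the first two items concrete, here is a minimal sketch in plain Python, not the book's own code. The toy rows, the column names, and the helper functions impute_mean and one_hot are all hypothetical and exist only for illustration. In scikit-learn, the SimpleImputer and OneHotEncoder classes cover the same ground.

```python
from statistics import mean

# Hypothetical toy rows; None marks a missing value.
rows = [
    {"age": 25.0, "city": "Cairo"},
    {"age": None, "city": "Berlin"},
    {"age": 35.0, "city": "Cairo"},
]


def impute_mean(rows, column):
    """Replace missing (None) entries in a numeric column with the column mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        # Copy every column except the one being encoded.
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new_row)
    return encoded


# Impute first, then encode, so every row ends up fully numeric.
prepared = one_hot(impute_mean(rows, "age"), "city")
for row in prepared:
    print(row)
```

The mean is computed only from the observed values, so the missing age is filled with the average of the two known ages. In practice, the fill value should be learned from the training set alone and then reused on the test set.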

Imputing missing values

"It is a capital mistake ...


ISBN: 9781838826048