Pre-Model Workflow and Pre-Processing

This chapter covers the following recipes:

  • Creating sample data for toy analysis
  • Scaling data to the standard normal distribution
  • Creating binary features through thresholding
  • Working with categorical variables
  • Imputing missing values through various strategies
  • A linear model in the presence of outliers
  • Putting it all together with pipelines
  • Using Gaussian processes for regression
  • Using SGD for regression
