3 Healthcare: Diagnosing COVID-19

This chapter covers

  • Analyzing tabular data to judge which feature engineering techniques are going to help
  • Implementing feature improvement, construction, and selection techniques on tabular data
  • Using scikit-learn’s Pipeline and FeatureUnion classes to make reproducible feature engineering pipelines
  • Interpreting ML metrics in the context of our problem domain to evaluate our feature engineering pipeline

In our first case study, we will focus on the classic feature engineering techniques that apply to virtually any tabular data (data in a row-and-column structure), such as value imputation, categorical data dummification, and feature selection via hypothesis testing. Tabular datasets (figure ...
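
To make the three techniques named above concrete, here is a minimal sketch of how they might be chained with scikit-learn. It is not the chapter's own code: the toy DataFrame, its column names (`age`, `fever`, `cough`), and the labels are invented for illustration, and the chi-squared test stands in for "feature selection via hypothesis testing" as one possible choice.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: columns and values are assumptions, not the book's dataset.
toy = pd.DataFrame({
    "age": [34, 51, np.nan, 28, 63, 45, np.nan, 39],
    "fever": ["yes", "no", "yes", np.nan, "yes", "no", "no", "yes"],
    "cough": ["dry", "none", "dry", "wet", np.nan, "none", "wet", "dry"],
})
labels = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# Value imputation: fill missing numeric values with the column median.
numeric = Pipeline([("impute", SimpleImputer(strategy="median"))])

# Categorical branch: impute the most frequent category, dummify (one-hot
# encode), then keep the 2 dummy columns with the highest chi-squared score.
# chi2 requires non-negative inputs, which one-hot columns satisfy.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("dummify", OneHotEncoder(handle_unknown="ignore")),
    ("select", SelectKBest(score_func=chi2, k=2)),
])

# Route each column to the branch that matches its type.
preprocess = ColumnTransformer([
    ("num", numeric, ["age"]),
    ("cat", categorical, ["fever", "cough"]),
])

# One reproducible object: feature engineering followed by a classifier.
model = Pipeline([
    ("preprocess", preprocess),
    ("classify", LogisticRegression(max_iter=1000)),
])

model.fit(toy, labels)
predictions = model.predict(toy)
features = model.named_steps["preprocess"].transform(toy)
```

Wrapping every step in a single `Pipeline` means the imputation values, dummy columns, and selected features are all learned from the training data only and then replayed unchanged on new data, which is what makes the feature engineering reproducible.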
