May 2024
Intermediate to advanced
486 pages
11h 33m
English
Our data cleaning efforts are often intended to prepare that data for use with a machine learning algorithm. Machine learning algorithms typically require some form of encoding of variables. Our models also often perform better with some form of scaling so that features with higher variability do not overwhelm the optimization. We show examples of that in this chapter and of how standardizing addresses the issue.
Machine learning algorithms typically require some form of encoding of variables. We almost always need to encode our features for algorithms to understand them correctly. For example, most algorithms cannot make sense of the values female or male, or know not to treat zip codes as ordinal. ...
Read now
Unlock full access