Treating the data

What do I mean when I say let's treat the data? I learned the term from the authors of the vtreat package, Nina Zumel and John Mount. You can read their excellent paper on the subject at this link: https://arxiv.org/pdf/1611.09477.pdf.

The definition they provide is: a processor or conditioner that prepares real-world data for predictive modeling in a statistically sound manner. In treating your data, you'll rid yourself of many of the data preparation headaches discussed earlier. The example with our current dataset will provide an excellent introduction to the benefits of this method and how you can tailor it to your needs. I like to think of treating your data as a smarter version of one-hot encoding.
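To make the idea concrete before turning to our dataset, here is a minimal sketch of an unsupervised treatment on a small made-up data frame. The data frame `toy` and its columns `size` and `color` are hypothetical and exist only for illustration. The sketch assumes the vtreat package is installed and uses its `designTreatmentsZ()` and `prepare()` functions, which design a treatment plan without an outcome variable and then apply it.

```r
library(vtreat)

# Hypothetical toy data: one numeric and one categorical column,
# each with a missing value.
toy <- data.frame(
  size  = c(1.5, 2.0, NA, 4.2, 3.1, 2.7, 5.0, 1.1),
  color = c("red", "blue", NA, "red", "green", "blue", "red", "green"),
  stringsAsFactors = FALSE
)

# Design a treatment plan without reference to any outcome variable.
plan <- designTreatmentsZ(toy, varlist = c("size", "color"), verbose = FALSE)

# Apply the plan; the result is an all-numeric data frame.
treated <- prepare(plan, toy)

# Inspect which derived variables were created from which originals.
print(plan$scoreFrame[, c("varName", "origName", "code")])
print(head(treated))
```

The treated frame should contain only numeric columns with no missing values. Categorical levels become indicator columns, much as in one-hot encoding, and missingness is recorded explicitly rather than causing rows to be dropped.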

The package ...
