12

Automate Data Cleaning with User-Defined Functions, Classes, and Pipelines

There are several good reasons to write reusable code. When we step back from the particular data-cleaning problem at hand and consider how it relates to very similar problems, we can improve our understanding of the key issues involved. We are also more likely to address a task systematically when we aim to solve it for the long term rather than settle for a before-lunch solution. This has the additional benefit of helping us disentangle the substantive issues from the mechanics of data manipulation.
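To make the idea of reusable cleaning code concrete, here is a minimal sketch of the kind of small helper one might write once and call on many datasets. It is an illustration only, not code from the chapter: the function name `count_missing`, the `MISSING` markers, and the sample rows are all hypothetical, and it uses only the standard library.

```python
from collections import Counter
from typing import Any, Iterable, Mapping

# Hypothetical choice of markers treated as "missing"; adjust per dataset.
MISSING = (None, "", "NA")


def count_missing(
    rows: Iterable[Mapping[str, Any]],
    missing: tuple = MISSING,
) -> dict:
    """Count missing values per column across an iterable of row dicts."""
    counts: Counter = Counter()
    for row in rows:
        for column, value in row.items():
            # Touch every column so that complete columns are reported as 0.
            counts[column] += 1 if value in missing else 0
    return dict(counts)


# Made-up sample data, used only to exercise the helper.
sample_rows = [
    {"id": 1, "age": 34, "city": "Lima"},
    {"id": 2, "age": None, "city": ""},
    {"id": 3, "age": "NA", "city": "Quito"},
]

missing_by_column = count_missing(sample_rows)
print(missing_by_column)
```

Because the missing-value markers are a parameter rather than hard-coded logic, the same function can be reused on a dataset that encodes missing values differently, which is the sort of separation between substance and mechanics described above.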

We will create several modules to accomplish routine data-cleaning tasks in this chapter. The functions and classes in these modules are ...

