14
DATA CLEANSING
It may be surprising that a book on data quality doesn't talk about “cleansing” until nearly the end of the text. Yet while data cleansing may be the first thing that comes to mind when thinking about data quality, cleansing actually encompasses only a small part of the data quality universe. The reason is that if data quality has been treated properly and planned for from the beginning, we should never really have to worry about cleaning data, since it should already be fit for use by the time the user sees it.
In actuality, though, since we have been largely operating for 40 years without considering data quality a priori, we live in a world where data have been subject to significant entropy and there is a significant ...