Cleansing Your Data

Every set of data contains some errors. Detecting and removing these errors, known as data cleansing, can be a lengthy process. However, efficient data cleansing is essential for drawing accurate conclusions from data analysis. In addition, one of the principles of the Data Protection Act (described in Chapter 8) is that your data must be accurate and, where necessary, kept up to date.


Data cleansing comprises three main steps:

  1. Detecting the errors
  2. Selecting and applying the most appropriate methods to correct the errors
  3. If possible, preventing the errors from happening again
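The three steps above can be illustrated with a minimal sketch. It assumes a small list of customer records held in memory; the field names, the sample rows, the email pattern, and the helper functions `detect_errors`, `correct`, and `accept` are all hypothetical and chosen only for illustration. A real database would apply the same ideas through queries and validation rules.

```python
import re

# Hypothetical customer records; the field names are assumptions for illustration.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "postcode": "ab1 2cd"},
    {"name": "  Bob Ray ", "email": "bob(at)example.com", "postcode": "EF3 4GH"},
    {"name": "", "email": "cy@example.com", "postcode": "IJ5 6KL"},
]

# A deliberately simple pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def detect_errors(record):
    """Step 1: return a list of the problems found in one record."""
    problems = []
    if not record["name"].strip():
        problems.append("missing name")
    if not EMAIL_PATTERN.match(record["email"]):
        problems.append("invalid email")
    return problems


def correct(record):
    """Step 2: apply safe, mechanical fixes (trim spaces, normalise case)."""
    return {
        "name": record["name"].strip(),
        "email": record["email"].strip().lower(),
        "postcode": record["postcode"].strip().upper(),
    }


def accept(record):
    """Step 3: check a record at the point of entry so bad rows are rejected."""
    return not detect_errors(record)


# Correct what can be fixed automatically, then flag what still needs review.
cleaned = [correct(r) for r in records]
flagged = {}
for index, record in enumerate(cleaned):
    problems = detect_errors(record)
    if problems:
        flagged[index] = problems
```

In this sketch the automatic corrections handle only formatting problems. The records left in `flagged` (a malformed email address and a missing name) cannot be repaired mechanically and would need to be checked by a person, which is one reason the process tends to be open-ended.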

The process of data cleansing is usually open-ended, as some errors are hard to find and eliminate. ...
