Data Quality
Abstract
This chapter discusses data quality, a preliminary consideration for any commercial data analysis project. Here, quality also covers the availability or accessibility of the data. The chapter examines typical problems with data, including errors in textual and numerical content and questions of relevance and reliability, and shows how to evaluate data quality quantitatively. Finally, a practical case study illustrates typical errors introduced during data extraction and how to avoid them.
Keywords
data quality
data extraction
availability
accessibility
relevance
reliability
Introduction
Data quality is a primary consideration for any commercial data analysis project, ...