9

Assessing the Adequacy of a Big Data Resource

Abstract

Before devoting time and energy to a data resource, the data analyst must determine whether the data is likely to be accurate, comprehensive, and representative; whether it has been organized sensibly; and whether it has been adequately annotated. The first step usually involves looking at all of the data, or at least at a large number of data samples. This chapter describes some basic approaches to examining Big Data resources.
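When a resource is too large to read in full, one way to look at "a large number of data samples" is to draw a uniform random sample of records in a single pass. The sketch below uses reservoir sampling (Algorithm R); it is an illustration rather than the chapter's own method, and the function name `sample_records` and its parameters are assumptions introduced here.

```python
import random
from typing import Iterable, List, Optional


def sample_records(records: Iterable[str], k: int,
                   seed: Optional[int] = None) -> List[str]:
    """Return up to k records chosen uniformly at random in one pass.

    Hypothetical helper: works on any iterable of records (for example,
    the lines of an open text file) without loading it all into memory.
    """
    rng = random.Random(seed)  # seed only for repeatable inspection
    reservoir: List[str] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = record
    return reservoir
```

Passing an open file object as `records` samples its lines directly. Reading the sampled records by eye, in a plain-text editor or at the console, gives a first impression of how the data is organized and annotated before any analysis begins.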

Keywords

ASCII editor; Data assessment; Comprehensive data; Representative data; Data flattening; Full access to data

Get Principles and Practice of Big Data, 2nd Edition now with the O’Reilly learning platform.
