4 Total Error Frameworks for Found Data

Paul P. Biemer and Ashley Amaya

RTI International, Research Triangle Park, NC, USA

4.1 Introduction

The survey world is relying more heavily on “found” data, rather than survey or “designed” data, for inference and decision‐making. Found data are data that were not primarily collected for statistical purposes but that contain information that might be useful for inference or for gaining insights about a population or phenomenon. For example, administrative data are a type of found data generated by systems that register persons or other entities, record transactions and other information for later retrieval, track participants, and so on. Big Data refers to data of extreme volume, variety, and velocity, often unstructured and created from system “exhaust” with no particular purpose other than data preservation. These data become found data when they are used to achieve some analytic purpose through data mining or analysis. National statistical offices such as Statistics New Zealand (https://www.stats.govt.nz/integrated-data/integrated-data-infrastructure), the Central Bureau of Statistics in The Netherlands (https://www.cbs.nl/en-gb/our-services/unique-collaboration-for-big-data-research), the Office for National Statistics in the United Kingdom (Duhaney 2017), and Statistics Canada (StatCan; Rancourt 2017) have established program areas devoted to the discovery of statistical uses of found data for official statistics. For example, at StatCan, integrating administrative ...

Get Big Data Meets Survey Science now with the O’Reilly learning platform.
