11 Mining the New Oil for Official Statistics¹

Siu‐Ming Tam¹˒², Jae‐Kwang Kim³, Lyndon Ang¹, and Han Pham¹

¹ Australian Bureau of Statistics, ABS House, Belconnen, ACT, 2617, Australia

² School of Mathematics, National Institute for Applied Statistical Research Australia, University of Wollongong, Keiraville, NSW, 2500, Australia

³ Department of Statistics, Iowa State University, Ames, IA, 50011, USA

11.1 Introduction

At the 2006 Senior Marketers' Conference (Haupt 2016), the UK mathematician Clive Humby pronounced that “Data is the new oil. It's valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.” We agree. As with oil, how data are initially created and then “refined” for statistics production determines their public value. Indeed, in official statistics, data that are not representative of the population on which public and private decisions are made, or that are susceptible to measurement errors, are of limited value at best. While official statisticians have used administrative data in the production of official statistics for decades, only recently have they begun developing new ways of refining the ubiquitous Big Data, defined by Eurostat (2014) as “large amount of data produced very quickly by a high number of diverse sources,” which include the Internet of Things, sensors, ...

