34 Methods for Estimating the Quality of Multisource Statistics

Arnout van Delden¹, Sander Scholtus¹, Ton de Waal¹,², and Irene Csorba³

¹ Statistics Netherlands, The Hague, The Netherlands

² Tilburg University, Tilburg, The Netherlands

³ Vrije Universiteit Amsterdam, Amsterdam, The Netherlands

34.1 Introduction

With the increasing availability of data, official business statistics are more often based on multiple data sources. Evaluating the accuracy of outputs based on multiple sources has therefore become an important topic. By accuracy, we mean the bias and the variance of the output, as affected by errors in the input data, in the processing of the data, or in the estimation of the targeted output. The output accuracy of multisource statistics tends to be significantly affected by a greater variety of error sources than that of single-source statistics. A first reason is that more processing steps are often involved, such as linkage and standardization of the data. Another reason is that some of the sources were not originally intended for statistical purposes; we refer to this as secondary use. These two reasons underlie the greater variety of error sources that comes with multisource statistics. In situations where sampling error and nonresponse error do not dominate (as they are regularly assumed to do in single-source surveys), these other error sources also need to be accounted for when estimating the output accuracy.

Quantification ...

Get Advances in Business Statistics, Methods and Data Collection now with the O’Reilly learning platform.