CHAPTER 13

BAYESIAN PENALIZED SPLINE MODELS FOR STATISTICAL PROCESS MONITORING OF SURVEY PARADATA QUALITY INDICATORS

JOSEPH L. SCHAFER

Center for Statistical Research and Methodology, U.S. Census Bureau

13.1 INTRODUCTION

Survey agencies are increasingly using paradata to monitor their data-collection processes. With direct visual inspection and plotting of process variables over time, it may be difficult to distinguish ordinary random fluctuations from systematic change and long-term trends. Traditional methods for statistical process monitoring (e.g., the Shewhart control chart) typically assume that the process mean is stable over time, and these methods may be ill-suited to a variable whose mean is adrift. In this chapter, I present flexible semiparametric models for paradata series that allow the process mean to vary. The mean function is approximated by a natural spline with a penalty for roughness that is estimated from the data. I describe efficient Markov chain Monte Carlo strategies for simulating random draws of model parameters from the high-dimensional posterior distribution and produce graphical summaries for process monitoring. I illustrate these methods on monthly paradata series from the National Crime Victimization Survey.
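To make the concern about a drifting mean concrete, the following is a minimal Python sketch of a Shewhart-style individuals chart applied to two simulated series. It is an illustration under stated assumptions, not the method developed in this chapter: the function names, the simulated data, the drift rate of 0.1 per period, and the choice to estimate sigma from the average moving range (divided by the standard constant d2 = 1.128 for ranges of two observations) are all assumptions made for the example.

```python
import random
import statistics


def shewhart_limits(series, k=3.0):
    """Return (lower, center, upper) limits for an individuals chart.

    Sigma is estimated as the average moving range divided by d2 = 1.128,
    a common convention for individuals charts (assumed here).
    """
    center = statistics.fmean(series)
    moving_ranges = [abs(b - a) for a, b in zip(series, series[1:])]
    sigma_hat = statistics.fmean(moving_ranges) / 1.128
    return center - k * sigma_hat, center, center + k * sigma_hat


def out_of_limit_points(series, k=3.0):
    """Indices of observations falling outside the k-sigma limits."""
    lower, _, upper = shewhart_limits(series, k)
    return [i for i, y in enumerate(series) if y < lower or y > upper]


# Simulated data (hypothetical): one stable series, one whose mean drifts.
rng = random.Random(0)
n = 120
stable = [rng.gauss(50.0, 1.0) for _ in range(n)]
drifting = [rng.gauss(50.0 + 0.1 * t, 1.0) for t in range(n)]

stable_flags = out_of_limit_points(stable)
drifting_flags = out_of_limit_points(drifting)

print("stable series: points outside limits =", len(stable_flags))
print("drifting series: points outside limits =", len(drifting_flags))
```

In this sketch the chart flags many points of the drifting series even though its short-run variability is the same as that of the stable series. A single fixed center line treats a slowly moving mean as a run of out-of-control signals, which is the situation a model with a time-varying mean function is meant to handle.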

13.1.1 Processes Under Control and Out of Control

Many survey agencies are now collecting large amounts of information that describe the data-collection process. Process data, also known as paradata (Couper, 1998), include variables such ...

