31 Variance Estimation Under Nearest Neighbor Ratio Hot Deck Imputation for Multinomial Data: Two Approaches Applied to the Service Annual Survey (SAS)

Rebecca Andridge¹, Jae Kwang Kim², and Katherine J. Thompson³

¹ Division of Biostatistics, The Ohio State University, Columbus, OH, USA

² Department of Statistics, Iowa State University, Ames, IA, USA

³ Associate Directorate of Economic Programs, U.S. Census Bureau, Washington, DC, USA

31.1 Introduction

Sample surveys are often designed to estimate totals (e.g. revenue, earnings). In addition, many surveys request sets of compositional variables (details) that sum to a total, such as a breakdown of total expenditures by type of expenditure or a breakdown of total income by source. The detail proportions can vary greatly by sample unit, and their multinomial distributions may be related to a different set of predictors than those associated with the total. All survey participants are asked to provide values for the total items (hereafter referred to as “totals”), whereas the type of requested details can vary. This creates two separate but related missing data challenges: (i) to develop viable imputation models for the total; and (ii) to develop viable imputation models for the set of associated detail items.

For business – or many establishment – surveys, the primary missing data challenge is the treatment of the details. With totals, reliable auxiliary data from the same unit are often available for direct substitution (e.g. administrative ...

Get Advances in Business Statistics, Methods and Data Collection now with the O’Reilly learning platform.
