17 Using Big Data to Improve Sample Efficiency

Jamie Ridenhour, Joe McMichael, Karol Krotki, and Howard Speizer

RTI International, Research Triangle Park, NC, USA

17.1 Introduction and Background

Lists can provide a good means of identifying members of a target population, but list frames are often incomplete in their coverage. To increase a frame's coverage, we may append or link additional data. Doing so, however, may decrease the efficiency of the sample by decreasing the eligibility rate, that is, the number of target population members found per fixed amount of sample. Improving the eligibility rate in a survey saves money because a smaller overall sample is needed to find a desired number of members of the target population. In the example we present here, our main purpose for using Big Data was to improve coverage (i.e., how much of our target population is on our frame) compared to an initial list frame. We then used data from the combined frame to model the likelihood that a given household would be eligible for the survey.
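The cost argument above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' method: the function name `required_sample_size`, the target of 1,000 eligible households, and the two eligibility rates are hypothetical values chosen for illustration, and the calculation ignores nonresponse.

```python
import math


def required_sample_size(target_eligibles: int, eligibility_rate: float) -> int:
    """Return the sample size needed to expect `target_eligibles` eligible units.

    Assumes every sampled unit is eligible with the same probability and
    ignores nonresponse (a simplification made for illustration only).
    """
    if not 0 < eligibility_rate <= 1:
        raise ValueError("eligibility_rate must be in the interval (0, 1]")
    # Expected eligibles = sample size * eligibility rate, so solve for
    # sample size and round up to a whole number of sampled units.
    return math.ceil(target_eligibles / eligibility_rate)


# Hypothetical rates: doubling the eligibility rate halves the sample needed.
baseline = required_sample_size(1000, 0.125)
improved = required_sample_size(1000, 0.25)
print(baseline, improved)
```

Under these assumed rates, raising the eligibility rate from 12.5% to 25% halves the sample that must be drawn, which is the source of the savings described above.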

The National Recreational Boating Safety Survey (NRBSS), which RTI conducts on behalf of the United States Coast Guard, exemplifies our approach to increasing both coverage and efficiency. The NRBSS collects information via web or paper survey from boat‐owning households on the number of registered and unregistered boats in the household along with additional information on boat trips within the prior month to assess boat safety exposure risk. ...

