21 Reproducibility in the Era of Big Data: Lessons for Developing Robust Data Management and Data Analysis Procedures

D. Betsy McCoach¹, Jennifer N. Dineen², Sandra M. Chafouleas¹, and Amy Briesch³

¹ The Neag School of Education, University of Connecticut, Storrs, CT, USA

² The Department of Public Policy, University of Connecticut, Storrs, CT, USA

³ Bouvé College of Health Sciences, Northeastern University, Boston, MA, USA

21.1 Introduction

In recent years, the options for survey data collection modes and sample sources have multiplied, and technological advances in data collection have blossomed (Couper 2013, 2017; Hill and Dever 2014). Current technology allows for ubiquitous, almost continuous data collection on the web, on smart devices, and beyond. These advances have ushered in a new era in survey research methods: Big Data meets survey data. This chapter focuses on the challenges and opportunities of using structured data from administrative and publicly available sources, such as school records, demographic data, or public health records, in conjunction with traditional survey data. We illustrate several issues that arise in managing large, multifaceted, multisource datasets using our recent educational research project, which incorporated both traditional surveys and preexisting administrative data gathered from multiple sources. We close with a set of recommendations for researchers who wish to integrate Big Data and survey data.

21.2 Big Data

Over the ...
