CHAPTER 4

DATA COLLECTION

4.1 INTRODUCTION

Software engineering case studies commonly generate a large amount of raw data, which then needs to be refined, for example, transcribed and then coded. Many alternative sources of data may also inform the study. It is therefore important to select the data sources carefully and to organize the raw and refined data in a structured way, so that appropriate data can be found when it is needed, for example, during analysis. Organizing the raw data makes it easier to refine and then analyze, and it also helps the researcher assure the quality of the study by ensuring that data is not lost or overlooked through disorganization.
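As an illustration only, one way to keep raw and refined data organized is a small catalog that records, for each item, its source, its refinement stage, and the item it was derived from. The sketch below is not taken from any particular tool or from the guidelines in this chapter. The names `DataItem`, `StudyCatalog`, and `unrefined_raw`, the stage labels, and the item identifiers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataItem:
    """One collected artifact, raw or refined (field names are illustrative)."""
    item_id: str
    source: str                          # e.g. "interview", "observation", "archive"
    stage: str                           # assumed labels: "raw", "transcribed", "coded"
    derived_from: Optional[str] = None   # item_id of the item this one refines


@dataclass
class StudyCatalog:
    """Hypothetical in-memory index of all data items in a case study."""
    items: dict[str, DataItem] = field(default_factory=dict)

    def add(self, item: DataItem) -> None:
        # Reject duplicates and dangling links so that nothing is silently lost.
        if item.item_id in self.items:
            raise ValueError(f"duplicate id: {item.item_id}")
        if item.derived_from is not None and item.derived_from not in self.items:
            raise ValueError(f"unknown parent: {item.derived_from}")
        self.items[item.item_id] = item

    def by_source(self, source: str) -> list[DataItem]:
        # Find all items from one data source, e.g. during analysis.
        return [i for i in self.items.values() if i.source == source]

    def unrefined_raw(self) -> list[str]:
        # Raw items that no other item has been derived from yet.
        parents = {i.derived_from for i in self.items.values() if i.derived_from}
        return sorted(
            i.item_id
            for i in self.items.values()
            if i.stage == "raw" and i.item_id not in parents
        )


catalog = StudyCatalog()
catalog.add(DataItem("int-01-audio", "interview", "raw"))
catalog.add(DataItem("int-02-audio", "interview", "raw"))
catalog.add(DataItem("int-01-transcript", "interview", "transcribed",
                     derived_from="int-01-audio"))
print(catalog.unrefined_raw())
```

Listing the raw items that have not yet been refined is one simple check against data being overlooked. A real study would typically keep such an index in a spreadsheet or database alongside the files themselves.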

In Section 4.2 different types of data source are discussed. Sections 4.3–4.7 then review five methods of data collection commonly used in software engineering case studies: interviews, focus groups, observations, archival data, and metrics.

4.2 DIFFERENT TYPES OF DATA SOURCE

4.2.1 Classification of Data Sources

According to Lethbridge et al. [118], data collection techniques can be divided into three degrees:

First degree. These are direct methods, where the researcher is in direct contact with the interviewees and collects data in real time. This is the case with, for example, interviews [162, pp. 277–282], focus groups [162, pp. 284–289], Delphi surveys [38], and observations with “think aloud” and protocol analysis [133].

Second degree ...
