The focus of our review was to gather quantitative evidence on the effects of the TDD pill on internal code quality (see below), external quality, productivity, and test quality. The evaluation of the TDD pill is based on data gathered from 32 clinical trials. In the first quarter of 2009, the authors gathered 325 TDD research reports from comprehensive online indices, major scientific publishers (ACM, IEEE, Elsevier), and “gray literature” (technical reports, theses). The initial set of 325 reports was narrowed down to 22 reports through a two-level screening process. Four researchers filtered out studies conducted prior to 2000, qualitative studies, surveys, and wholly subjective analyses of the TDD pill. Some of the remaining reports contained multiple trials, and some trials overlapped (i.e., the same trial was reported in multiple papers); each overlapping trial was counted only once. A team of five researchers then extracted key information from the reports regarding study design, study context, participants, treatments and controls, and study results. In total, the research team analyzed 22 reports containing 32 unique trials.