Throughout this book, you will find examples of how to gather evidence: evidence on the effectiveness of testing, the quality of bug reports, the role of complexity metrics, and so on. But do these findings actually apply to your project? The definitive way to find out is to repeat the appropriate study on your own data, in your own environment. This way, you will not only gain insight into your own project; you will also experience the joys of experimental research. Unfortunately, you may also encounter the downside: empirical studies can be very expensive, particularly if they involve experiments with developers.
Fortunately, there is a relatively inexpensive way to gather lots of evidence about your project. Software archives, such as version control or bug repositories, record much of the activity around your product: problems occurring, changes made, and problems fixed. By mining these archives automatically, you can obtain a great deal of initial evidence about your product. This evidence is valuable in itself, and it may also pave the way toward further experiments and further insights. In this chapter, we give a hands-on tutorial on mining software archives, covering both the basic technical steps and the pitfalls that you may encounter along the way.