Skip to Content
Making Software
book

Making Software

by Andy Oram, Greg Wilson
October 2010
Beginner to intermediate
624 pages
24h 9m
English
O'Reilly Media, Inc.
Content preview from Making Software

A Mining Primer

Let us now describe how to retrieve actual historic data. Each software repository is different, and so are the mining steps necessary to extract the relevant data points. But most mining tools and setups share a common procedure. Here we give a step-by-step guide to mining a software repository on a real-world example: the IBM Eclipse project. Eclipse is open source software and has been a subject of many empirical software engineering research projects.

Thus, Eclipse is an ideal candidate for a hands-on example. But even though it is an open source project that exposes all the data about bugs and their fixes, the unstructured ways in which this information was collected makes it an interesting and challenging case study.

Historic data for a software project is preserved through many different activities in many different systems (e.g., version control, bug tracking systems, email messages, etc.). In order to extract and learn from the history of a software project, you have to access these resources. For many open source systems such as Eclipse, most of these resources are publicly available and can be accessed easily.

Following the steps detailed in the following sections, we will extract and link Eclipse history, process data, and bug data that can be used for various kinds of defect prediction or process analysis. Normally, the results get stored in persistent data storage systems (e.g., relational databases) that allow further analysis steps and manual inspection. ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

AirBnbBlueOriginElectronic ArtsHomeDepotNasdaqRakutenTata Consultancy Services

QuotationMarkO’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.
Julian F.
Head of Cybersecurity
QuotationMarkI wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.
Addison B.
Field Engineer
QuotationMarkI’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.
Amir M.
Data Platform Tech Lead
QuotationMarkI'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.
Mark W.
Embedded Software Engineer

You might also like

Righting Software

Righting Software

Juval Lowy
How Software Works

How Software Works

V. Anton Spraul
Design It!

Design It!

Michael Keeling

Publisher Resources

ISBN: 9780596808310Errata Page