Chapter 13. Versioning and Migration

In earlier chapters we covered designing, building, and deploying ETL solutions and showed how to use Kettle as a single-user development tool. In practice, several developers usually work on a project, so the different deliverables need to be managed with a version control system. Most projects also require separate development, test, acceptance, and production environments. The following ETL subsystems cover these requirements:

  • Subsystem 25: Version Control System

  • Subsystem 26: Version Migration System from development to test to production

In this chapter, we discuss the reasons for using version control systems and take a close look at a few popular open source ones. After that, we discuss what Kettle metadata actually is and the formats in which it can be expressed. Then we explain how you can do versioning and migration with Kettle metadata.

Version Control Systems

When you are developing a data integration solution by yourself, or maybe with a small team, it's easy to keep track of what's going on. It's also fairly easy to find out what changed and who did it. However, things change drastically when the stakes go up and a data integration solution goes into production. Things change even more when you are working with a larger team or with a team that is geographically distributed. In those situations, you really want to keep your data integration services running ...
