Chapter 4. An Example ETL Solution—Sakila

After the gentle introduction to ETL and Kettle provided by the first two chapters and the installation and configuration guide provided in Chapter 3, it's finally time to get a bit of hands-on experience with Kettle. Using a fairly uncomplicated yet sufficiently realistic ETL solution, this chapter will give you a quick impression of Kettle's features and capabilities. In addition, you'll get just enough experience with the Spoon program to follow along with the more complicated examples in the remainder of this book.

This chapter does not provide detailed step-by-step instructions for building this solution yourself. Instead, we provide all the instructions you need to set up the solution on your own system, using transformations and jobs that are available on this book's website at www.wiley.com/go/kettlesolutions. As we discuss the design and constructs used in the example, we will refer to the specific chapters and sections later in this book that describe a particular feature or technique in more detail.

Sakila

Our example ETL solution is based on a fairly simple star schema that can be used to analyze rentals for a fictitious DVD rental store chain called Sakila. This star schema is derived from the sakila database, a freely obtainable sample database for MySQL.

Note

The sakila sample database was originally developed by Mike Hillyer, who was at the time a technical writer ...
