Chapter Eleven: Model Data with dbt

dbt is a lightweight but powerful open source tool built around your SQL files. Consult the dbt online documentation for how to install it and get started. Don't worry, we're not selling you anything: the tool is free.

You could build staging schemas without a modeling tool, but for all the reasons we described earlier it would not be efficient. dbt wraps SQL with bookkeeping and a templating language called Jinja so that teams can write whatever transformations they need and manage them at scale.

We'll dive into some of the central features packaged in dbt.

Version Control

Modeling data is an ongoing process. We need a way to change the modeling code while still being able to see how it worked in the past. Even better is the ability to revert a change when it introduces a bug. Version control provides exactly this.

While version control is ubiquitous in software engineering, it is still a new concept in the analytics world. We will not provide an in-depth tutorial here, but we recommend checking out the following resources:

Modularity and Reusability

dbt enables organizations to easily define their data in schemas. If something about the business logic changes, the dbt files can be updated, and those changes will propagate to models and users downstream. ...
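To make the idea of propagation concrete, here is a toy sketch in Python (3.9 or later, for the standard-library graphlib module). It is not how dbt is implemented. It only imitates the general idea that models point at one another through Jinja-style `{{ ref('...') }}` calls, and that those references are enough to work out a build order and to see which downstream models a change touches. The model names, the SQL, the `analytics` schema, and every function name below are hypothetical, chosen purely for illustration.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical model files: model name -> SQL text.
# Downstream models point at upstream ones with dbt-style ref() calls.
MODELS = {
    "stg_orders": "select id, customer_id, amount from raw.orders",
    "stg_customers": "select id, name from raw.customers",
    "customer_revenue": (
        "select c.name, sum(o.amount) as revenue "
        "from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c on c.id = o.customer_id "
        "group by c.name"
    ),
}

# Matches {{ ref('model_name') }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")


def dependencies(sql):
    """Return the set of model names referenced in one model's SQL."""
    return set(REF_PATTERN.findall(sql))


def build_order(models):
    """Order models so each one comes after the models it references."""
    graph = {name: dependencies(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


def compile_model(name, models, schema="analytics"):
    """Replace each ref() call with a schema-qualified relation name."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", models[name])


def downstream(changed, models):
    """Return every model that depends, directly or indirectly, on `changed`."""
    affected = set()
    frontier = {changed}
    while frontier:
        frontier = {
            name
            for name, sql in models.items()
            if dependencies(sql) & frontier and name not in affected
        }
        affected |= frontier
    return affected


if __name__ == "__main__":
    print(build_order(MODELS))
    print(compile_model("customer_revenue", MODELS))
    print(downstream("stg_orders", MODELS))
```

In this sketch, editing the SQL for `stg_orders` requires no edits to `customer_revenue`. The downstream model refers to it by name only, so rebuilding in dependency order is enough to carry the change forward.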
