3 Scheduling in Airflow
This chapter covers
- Running DAGs at regular intervals
- Constructing dynamic DAGs to process data incrementally
- Loading and reprocessing past data sets using backfilling
- Applying best practices for reliable tasks
In the previous chapter, we explored Airflow’s UI and showed you how to define a basic Airflow DAG and run it every day by defining a schedule interval. In this chapter, we dive deeper into scheduling in Airflow and explore how it lets you process data incrementally at regular intervals. First, we’ll introduce a small use case focused on analyzing user events from our website and explore how we can build a DAG to analyze these events at regular intervals. Next, we’ll explore ...