Generally speaking, there are two kinds of problems you’ll run into more often than not as a data engineer. The first stems from broken promises (that is, bad upstream data sources) and, more broadly, from the unknown unknowns of moving data through your pipelines. The second problem is time. This is not the part of the book where I start to talk to you about life, death, and decision making, but rather time as a ...
8. Workflow Orchestration with Apache Airflow
From Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications.