Data Pipelines with Apache Airflow

by Julian de Ruiter, Bas Harenslak
May 2021
Beginner to intermediate
480 pages
12h 59m
English
Manning Publications

8 Building custom components

This chapter covers

  • Making your DAGs more modular and succinct with custom components
  • Designing and implementing a custom hook
  • Designing and implementing a custom operator
  • Designing and implementing a custom sensor
  • Distributing your custom components as a basic Python library

One strong feature of Airflow is that it can be easily extended to coordinate jobs across many different types of systems. We have already seen some of this functionality in earlier chapters, where we executed a job for training a machine learning model on Amazon’s SageMaker service using the SageMakerTrainingOperator, but you can (for example) also use Airflow to run jobs on an ECS (Elastic Container Service) cluster in AWS using ...
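
To make the three component types in the chapter outline concrete, here is a minimal sketch of the contracts they follow, written without importing Airflow so it runs on its own. It assumes only the well-known shape of Airflow's base classes: an operator does its work in `execute(context)`, a sensor is polled through `poke(context)` until it returns true, and a hook hides the details of talking to an external system. The names `FakeRatingsHook`, `FetchRatingsOperator`, and `RatingsAvailableSensor`, and the in-memory ratings data, are hypothetical stand-ins rather than the book's code; in a real DAG they would subclass Airflow's `BaseHook`, `BaseOperator`, and `BaseSensorOperator`.

```python
from datetime import date


class FakeRatingsHook:
    """Hypothetical hook: hides how we talk to an external system."""

    def __init__(self, records):
        # A real hook would read credentials from an Airflow connection;
        # here the "remote system" is just an in-memory list (assumption).
        self._records = records

    def get_ratings(self, start, end):
        """Return records whose date falls in the half-open range [start, end)."""
        return [r for r in self._records if start <= r["date"] < end]


class FetchRatingsOperator:
    """Hypothetical operator: one unit of work, run via execute(context)."""

    def __init__(self, hook, start, end):
        self._hook = hook
        self._start = start
        self._end = end

    def execute(self, context):
        # Delegate the system-specific work to the hook.
        return self._hook.get_ratings(self._start, self._end)


class RatingsAvailableSensor:
    """Hypothetical sensor: poke(context) reports whether data exists yet."""

    def __init__(self, hook, start, end):
        self._hook = hook
        self._start = start
        self._end = end

    def poke(self, context):
        return len(self._hook.get_ratings(self._start, self._end)) > 0


# Made-up sample data for illustration only.
records = [
    {"user": 1, "movie": 10, "rating": 4.0, "date": date(2021, 1, 1)},
    {"user": 2, "movie": 10, "rating": 3.5, "date": date(2021, 1, 2)},
]
hook = FakeRatingsHook(records)
window = (date(2021, 1, 1), date(2021, 1, 2))

sensor = RatingsAvailableSensor(hook, *window)
operator = FetchRatingsOperator(hook, *window)

fetched = []
if sensor.poke(context={}):
    fetched = operator.execute(context={})
print(len(fetched))
```

Keeping the system-specific logic in the hook means the operator and the sensor can share it, which is one way custom components can make DAGs more modular and succinct.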

ISBN: 9781617296901