How to do it…

  1. With Airflow, a pipeline specification is written in Python, which is quite convenient for us! Let's start with the imports:
from datetime import datetime, timedelta
import os
from os.path import isfile

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.contrib.hooks.ftp_hook import FTPHook
from airflow.contrib.hooks.fs_hook import FSHook
from airflow.contrib.sensors.file_sensor import FileSensor
  2. Our main object will be a Directed Acyclic Graph (DAG), which will include all the tasks and their dependencies. For now, we will specify a dictionary with some properties:
dag_args = {
    'owner': 'airflow',
    'description': 'Bioinformatics with Python Cookbook pipeline',
    'depends_on_past': ...
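
To make the idea of a DAG of tasks and dependencies more concrete, here is a minimal sketch that uses only Python's standard library (graphlib, available from Python 3.9) rather than Airflow itself. The task names are hypothetical and are not taken from the recipe; the point is only to show how declaring dependencies determines a valid execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names (an assumption for illustration only).
# Each key maps to the set of tasks that must finish before it can run.
dependencies = {
    'download_files': set(),
    'check_files': {'download_files'},
    'process_files': {'check_files'},
    'write_report': {'process_files'},
}

# static_order() yields each task only after all of its dependencies.
# It raises graphlib.CycleError if the graph contains a cycle, which is
# why the graph has to be acyclic.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

A scheduler such as Airflow works from the same kind of information: you declare the tasks and which ones depend on which, and the scheduler decides when each task is allowed to run.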
