July 2018
Intermediate to advanced
506 pages
16h 2m
English
Once a pipeline is up and running, there are limited options for managing its execution. Currently, developers may cancel or drain a running job. Canceling a job halts execution almost immediately, which makes it a good option for idempotent pipelines: those where no state is lost during ingestion and re-processing elements has no side effects. For example, a pipeline that performs a lift-and-shift from a CSV file in Cloud Storage into a BigQuery table using truncate-and-reload can likely be canceled mid-job and run again at a later date.
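As a minimal sketch of how the cancel-or-drain choice might be scripted, the helper below builds the argument list for the `gcloud dataflow jobs cancel` or `gcloud dataflow jobs drain` command. The function name `stop_command`, the `idempotent` flag, and the job ID and region values are hypothetical and used only for illustration. The rule it encodes (cancel when the pipeline is idempotent, drain otherwise) is a simplification, not a definitive policy.

```python
from typing import List


def stop_command(job_id: str, region: str, idempotent: bool) -> List[str]:
    """Build a gcloud argument list for stopping a Dataflow job.

    Assumption for illustration: an idempotent pipeline is canceled,
    and any other pipeline is drained instead.
    """
    action = "cancel" if idempotent else "drain"
    return ["gcloud", "dataflow", "jobs", action, job_id, f"--region={region}"]


if __name__ == "__main__":
    # Hypothetical job ID and region; replace with real values before running.
    args = stop_command("my-job-id", "us-central1", idempotent=True)
    print(" ".join(args))
```

The helper only assembles the command and does not execute it, so the decision can be reviewed or logged before anything is stopped. Passing the returned list to `subprocess.run` would issue the request, assuming the `gcloud` CLI is installed and authenticated.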
However, canceling pipelines that consume data destructively, such as those with a PubsubIO source, will likely result in lost data. For cases like this, ...