I have a DAG which fans out to multiple independent units that run in parallel. This runs in AWS, so we have one task which scales our AutoScalingGroup up to the maximum number of workers when the DAG starts, and another which scales it back down to the minimum when the DAG completes. The simplified version looks like this:
           | - - taskA - - |
           |               |
scaleOut - | - - taskB - - | - scaleIn
           |               |
           | - - taskC - - |
However, some of the tasks in the parallel set fail occasionally, and I can't get the scaleIn task to run when any of the A-C tasks fail.
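To make the failure mode concrete, here is a rough plain-Python model of the graph above. It is not Airflow code: the `UPSTREAM` dict, the `simulate` helper, and the state names are all made up for illustration, and it assumes a task runs only when every one of its upstream tasks succeeded, which is the behaviour I'm seeing.

```python
# Plain-Python model of the DAG shape above (illustrative only, not Airflow code).
# Each task maps to the list of tasks it waits on.
UPSTREAM = {
    "scaleOut": [],
    "taskA": ["scaleOut"],
    "taskB": ["scaleOut"],
    "taskC": ["scaleOut"],
    "scaleIn": ["taskA", "taskB", "taskC"],
}


def simulate(failing):
    """Return each task's final state.

    Assumption: a task runs only if all of its upstream tasks succeeded.
    `failing` is the set of task names that fail when they do run.
    """
    states = {}
    # The dict above is already listed in a valid topological order.
    for task, parents in UPSTREAM.items():
        if any(states[p] != "success" for p in parents):
            states[task] = "not_run"
        elif task in failing:
            states[task] = "failed"
        else:
            states[task] = "success"
    return states


if __name__ == "__main__":
    # taskB fails, so scaleIn never runs and the workers stay scaled out.
    print(simulate(failing={"taskB"}))
```

With `taskB` in the failing set, `scaleIn` ends up in the `not_run` state, which is exactly the situation that leaves the AutoScalingGroup at its maximum size.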
What's the best way to have a task execute at the end of the DAG, once all other tasks have completed (whether they succeeded or failed)? The depends_on_upstream setting sounded like what we needed, but it had no effect when I tested it.