We have a huge DAG, with many small, fast tasks and a few big, time-consuming ones.
We want to run just a part of the DAG, and the easiest way we have found is simply not to add the tasks that we don't want to run. The problem is that our DAG has many co-dependencies, so it has become a real challenge not to break the DAG when we want to skip some tasks.
Is there a way to give a task a default status (for every run)? Something like:
# get the skip list from an env variable
task_list = models.Variable.get('list_of_tasks_to_skip')
dag.skip(task_list)
or
for task in task_list:
    task.status = 'success'
ShortCircuitOperator or BranchPythonOperator coupled with a TriggerDagRunOperator. The problem that persists with this approach is that if you still have more parts that are supposed to run after these small top-level DAGs, then you'll need something like ExternalTaskSensor to await completion of these small DAGs before triggering those parts. Untidy. - y2k-shubham
env-var (dynamically). We didn't find a way to skip tasks in Airflow, but we realized that it is possible to create a DAG based on an env-var. All our tasks were basically the same, so we create them in a loop based on a list of tasks saved in an env-var. Then, when we want to skip some, we modify that var with a graph algorithm. Hope it helps. - Pablo
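Pablo's answer does not show the graph algorithm, so the following is only one possible sketch of the idea, not his actual code. It assumes the dependencies are available as a plain dict mapping each task id to its direct upstream task ids, and that the skip list lives in a comma-separated environment variable; the names `TASKS_TO_SKIP`, `read_skip_list` and `prune_skipped` are made up for illustration. Skipped tasks are dropped and their downstream tasks are reconnected to the nearest upstream tasks that are kept, so the remaining DAG does not break.

```python
import os


def read_skip_list(env=os.environ, name="TASKS_TO_SKIP"):
    """Parse a comma-separated list of task ids (hypothetical variable name)."""
    raw = env.get(name, "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def prune_skipped(deps, skip):
    """Return a new dependency map without the skipped tasks.

    deps maps each task id to the set of its direct upstream task ids.
    A task that depended on a skipped task inherits the nearest
    non-skipped upstream tasks instead, so the ordering among the
    kept tasks is preserved.
    """
    skip = set(skip)

    def kept_upstream(task):
        result, seen = set(), set()
        stack = list(deps.get(task, ()))
        while stack:
            up = stack.pop()
            if up in seen:
                continue
            seen.add(up)
            if up in skip:
                # Look through the skipped task to its own upstream tasks.
                stack.extend(deps.get(up, ()))
            else:
                result.add(up)
        return result

    return {task: kept_upstream(task) for task in deps if task not in skip}


# Illustrative task names only.
deps = {
    "extract": set(),
    "clean": {"extract"},
    "big_report": {"clean"},
    "publish": {"big_report"},
}
pruned = prune_skipped(deps, ["big_report"])
print(pruned)
```

In a DAG file, the pruned map could then drive the task-creation loop: create one operator per key, then call `set_upstream` on it for each of its upstream ids. Whether skipping a task this way is safe depends on whether downstream tasks actually need its output, which this sketch does not check.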