
I am trying to understand what is going on with Airflow. I have set up my DAG with the catchup=False parameter. However, when I enable the DAG, it runs the jobs from start_date anyway. I also tried setting catchup to False in airflow.cfg, but that had no effect either.
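For context, catchup=False does not mean that nothing runs when the DAG is enabled: the scheduler still creates one run for the most recent completed interval, and only skips the older backlog. The sketch below is a simplified, pure-Python model of that rule, not Airflow's scheduler code. The function name `intervals_to_schedule` and the dates are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def intervals_to_schedule(start_date, interval, now, catchup):
    """Toy model: return the start of each interval that would get a run."""
    starts = []
    current = start_date
    # A run for an interval is created only once that interval has fully elapsed.
    while current + interval <= now:
        starts.append(current)
        current += interval
    if catchup:
        # catchup=True: every completed interval since start_date gets a run.
        return starts
    # catchup=False: skip the backlog, keep only the most recent completed interval.
    return starts[-1:]


start = datetime(2024, 1, 1)
now = datetime(2024, 1, 5, 12)
daily = timedelta(days=1)

with_catchup = intervals_to_schedule(start, daily, now, catchup=True)
without_catchup = intervals_to_schedule(start, daily, now, catchup=False)
print(len(with_catchup))   # number of runs created when catchup=True
print(without_catchup)     # the single interval run when catchup=False
```

Under this model, seeing a single run right after enabling the DAG is expected with catchup=False. Seeing a run for every interval since start_date would suggest the setting is not being applied to the DAG.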

Also, I have been trying to understand how clear and backfill work. It looks like clear just sets the states to NULL if the date is before start_date, but clears the states AND triggers the DAG for that date if it is after start_date. I want clear to simply set the states for that day's pipeline to None for ANY date, just as it does for dates before start_date.

Is this a known Airflow bug? My use case is to be able to clear any date range and then run backfill on the same date range.

It would help to provide your Airflow version as well as your DAG code. – brki

1 Answer

For your question about clear: I'd say it's a feature, not a bug. From an operations perspective, it's quite handy.

Suppose a DAG has run and one of its tasks has failed, but the cause is something that can be corrected so that the task (and its dependent tasks) can succeed. Once you correct it (e.g. a missing connection definition, a firewall problem, etc.), you can clear that task along with its downstream tasks, and the scheduler automatically puts the DAG run back into the running state and restarts the cleared tasks.
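To make that behaviour concrete, here is a minimal toy model in plain Python. It is not Airflow's implementation or API: the function `clear`, the task names, and the dictionaries are all hypothetical. It only mirrors the description above, where clearing resets the chosen task and its downstream tasks to None and the run goes back to running so the scheduler picks those tasks up again.

```python
from collections import deque


def clear(task_states, downstream, task_id, include_downstream=True):
    """Toy model: reset task_id (and optionally its downstream tasks) to None.

    Returns the new state of the DAG run, which goes back to "running".
    """
    to_clear = deque([task_id])
    seen = set()
    while to_clear:
        current = to_clear.popleft()
        if current in seen:
            continue
        seen.add(current)
        # A cleared task has no state, so it is eligible to be scheduled again.
        task_states[current] = None
        if include_downstream:
            to_clear.extend(downstream.get(current, []))
    return "running"


# Hypothetical three-task pipeline: extract -> transform -> load
downstream = {"extract": ["transform"], "transform": ["load"], "load": []}
states = {"extract": "success", "transform": "failed", "load": "upstream_failed"}

run_state = clear(states, downstream, "transform")
print(states)
print(run_state)
```

In this sketch the upstream task keeps its success state, so only the failed task and whatever depends on it are re-run.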