I have created a DAG, it is available in the Airflow UI, and I turned it on to run it. After running the DAG, its status showed as up for retry. I then went to the server and ran the command "airflow scheduler", after which the DAG succeeded.

The scheduler was already up and running before I ran the DAG, so I am not sure why this happened. Do we need to run the Airflow scheduler whenever we create a new DAG? I would like to know how the scheduler works.

Thanks

1 Answer

You can look at the airflow scheduler as an infinite loop that checks tasks' states on each iteration and triggers tasks whose dependencies have been met.
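To make the "infinite loop" picture concrete, here is a minimal toy sketch in plain Python. It is not Airflow's actual implementation: the `Task` class, `scheduler_pass`, and `run_scheduler` are hypothetical names made up for illustration, and the loop is bounded so the example terminates.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical stand-in for a task instance; not an Airflow class.
    name: str
    upstream: list = field(default_factory=list)
    state: str = "pending"


def scheduler_pass(tasks):
    """One loop iteration: trigger pending tasks whose upstream tasks succeeded."""
    by_name = {t.name: t for t in tasks}
    # Decide what is ready BEFORE changing any state, so a task only runs
    # once its dependencies were already met at the start of this pass.
    ready = [
        t for t in tasks
        if t.state == "pending"
        and all(by_name[u].state == "success" for u in t.upstream)
    ]
    for t in ready:
        # A real scheduler would hand the task to an executor here;
        # this toy version just marks it as done.
        t.state = "success"
    return [t.name for t in ready]


def run_scheduler(tasks, max_iterations=10):
    """Repeat scheduler passes until nothing is triggered (bounded for the demo)."""
    history = []
    for _ in range(max_iterations):
        triggered = scheduler_pass(tasks)
        if not triggered:
            break
        history.append(triggered)
    return history


tasks = [
    Task("extract"),
    Task("transform", upstream=["extract"]),
    Task("load", upstream=["transform"]),
]
history = run_scheduler(tasks)
print(history)  # one list of triggered task names per iteration
```

Each pass only triggers tasks whose dependencies were already satisfied, which is why a three-step chain needs three iterations. If the loop stops running, nothing gets triggered, however many DAGs are defined.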

This process generates data that accumulates with every iteration, and at some point the scheduler's performance can degrade enough to make it effectively useless. Whether this affects you depends on your Airflow version: it seems to be solved in the newest version (2.0), but for older versions (< 2.0) the recommendation was to restart the scheduler every run_duration seconds, with some people suggesting once an hour or once a day. So, unless you're working on Airflow 2.0, I think this is what you're experiencing.
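As a rough illustration of the periodic-restart idea, the sketch below launches a command, stops it after a fixed number of seconds, and starts it again. It is a generic wrapper and not an Airflow feature: `run_with_periodic_restart` and its parameters are hypothetical names, and in practice a process supervisor would usually do this job.

```python
import subprocess
import sys


def run_with_periodic_restart(cmd, run_duration, max_runs):
    """Launch cmd, stop it after run_duration seconds, relaunch; repeat max_runs times.

    Returns the number of times the command was launched.
    """
    runs = 0
    for _ in range(max_runs):
        proc = subprocess.Popen(cmd)
        runs += 1
        try:
            # If the process exits on its own before the deadline, just relaunch.
            proc.wait(timeout=run_duration)
        except subprocess.TimeoutExpired:
            # Deadline reached: ask the process to stop and wait until it is gone.
            proc.terminate()
            proc.wait()
    return runs


# Assumed usage (not executed here): restart the scheduler once an hour for a day.
# run_with_periodic_restart(["airflow", "scheduler"], run_duration=3600, max_runs=24)
```

The `max_runs` bound is only there so the sketch terminates; a long-running setup would loop indefinitely and would also want logging and handling for a command that fails to start.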

You can find references to this scheduler-restarting issue in posts made by Astronomer here and here.