I’m new to Airflow and I’m trying to understand how to use the scheduler correctly. Basically, I want to schedule tasks the same way I do with cron. There’s a task that needs to run every 5 minutes, and I want its first DAG run to start at the next even 5-minute slot after I add the DAG file to the dags directory or after I make changes to the DAG file.
I know that a DAG run is triggered at the end of its schedule_interval. If I add a new DAG and use start_date=days_ago(0), I get unnecessary runs starting from the beginning of the day. It also feels stupid to hardcode a specific start date in the DAG file, e.g. start_date=datetime(2019, 9, 4, 10, 1, 0, 818988). Is my approach wrong, or is there some specific reason why start_date needs to be set?
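To make the arithmetic concrete, here is a minimal sketch in plain Python (no Airflow imports) of the two quantities in question: the next even 5-minute slot, and the number of already-closed intervals that a midnight start_date would leave to be backfilled. The helper names next_slot and closed_intervals are hypothetical, and the model is a simplification, not Airflow's actual scheduling code.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=5)

def next_slot(now, interval=INTERVAL):
    """Return the first interval-aligned time strictly after `now`, cron-style."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = now - midnight
    # timedelta // timedelta yields the number of whole intervals elapsed
    return midnight + (elapsed // interval + 1) * interval

def closed_intervals(start_date, now, interval=INTERVAL):
    """Count the intervals that have fully elapsed between start_date and now."""
    return max((now - start_date) // interval, 0)

# Hypothetical moment at which the DAG file is added
now = datetime(2019, 9, 4, 10, 1, 0)
midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)

print(next_slot(now))                   # the slot a cron job would fire at next
print(closed_intervals(midnight, now))  # runs implied by a midnight start_date
```

Under these assumptions, a start_date at midnight implies one run per closed 5-minute interval since the start of the day. Airflow's DAG constructor also accepts a catchup argument; with catchup=False the scheduler only creates a run for the most recent closed interval instead of backfilling the earlier ones.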
start_date: datetime.now() - timedelta(minutes=5) or something similar? - absolutelydevastated
Don't use datetime.now() or dynamic-start-date. Quoting the relevant line here: "...We recommend against using dynamic values as start_date, especially datetime.now() as it can be quite confusing. The task is triggered once the period closes, and in theory an @hourly DAG would never get to an hour after now as now() moves along..." - y2k-shubham
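The quoted warning can be illustrated with a small, simplified model in plain Python. It assumes only the rule stated in the quote (a run is triggered once its period closes) plus the fact that the DAG file is re-parsed repeatedly, so a start_date of datetime.now() is re-evaluated at every check. The function first_run_due and the list of check times are hypothetical; this is a sketch of the reasoning, not Airflow's scheduler logic.

```python
from datetime import datetime, timedelta

def first_run_due(start_date, interval, now):
    """Simplified rule: the first run fires once its interval has closed."""
    return start_date + interval <= now

interval = timedelta(hours=1)  # stands in for an @hourly schedule
fixed_start = datetime(2019, 9, 4, 10, 0)

# Hypothetical scheduler checks, 30 minutes apart, starting at fixed_start
ticks = [fixed_start + timedelta(minutes=30 * i) for i in range(5)]

# Static start_date: the same value is seen at every check
static_due = [first_run_due(fixed_start, interval, now) for now in ticks]

# Dynamic start_date: re-evaluated as "now" at every check, so it moves along
dynamic_due = [first_run_due(now, interval, now) for now in ticks]

print(static_due)
print(dynamic_due)
```

In this model the static start_date becomes due as soon as the first hour has closed, while the dynamic one never does, because start_date + interval always stays one hour ahead of the current time.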