I need to run a Spark job that takes a date as an argument, which it uses to select the directory to read. I am using Airflow to schedule the job. The relevant settings are below.
start_date
import pendulum
from datetime import datetime
local_tz = pendulum.timezone("Asia/Kolkata")
'start_date': datetime(year=2020, month=8, day=3, tzinfo=local_tz)
schedule_interval
schedule_interval='20 0 * * *'
value to pass in job
{{ (execution_date + macros.timedelta(hours=5,minutes=30) - macros.timedelta(days=1)).strftime("%Y/%m/%d") }}
We need to run this job just after midnight for the previous day's data, but this expression gives me the date for the day before yesterday. I added 5:30 because our Airflow instance uses UTC.
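The arithmetic can be reproduced outside Airflow with plain `datetime`. This is only a minimal sketch under two assumptions that are not confirmed here: that `execution_date` is rendered in UTC, and that it marks the start of the schedule interval (one period before the moment the run actually fires). The helper name `rendered_date`, the fixed-offset `IST` stand-in for `Asia/Kolkata`, and the example run time are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed +05:30 offset standing in for Asia/Kolkata (no DST, so this is equivalent here)
IST = timezone(timedelta(hours=5, minutes=30))


def rendered_date(execution_date_utc):
    """Mimic the template expression with plain datetime arithmetic."""
    shifted = execution_date_utc + timedelta(hours=5, minutes=30) - timedelta(days=1)
    return shifted.strftime("%Y/%m/%d")


# Hypothetical run that fires at 00:20 IST on 2020-08-05
run_fires_at = datetime(2020, 8, 5, 0, 20, tzinfo=IST)

# Assumption: execution_date is the START of the interval, one schedule period earlier,
# and is expressed in UTC (2020-08-03 18:50 UTC in this example)
execution_date = (run_fires_at - timedelta(days=1)).astimezone(timezone.utc)

print(rendered_date(execution_date))  # 2020/08/03
```

If those assumptions hold, the run that fires on 2020-08-05 already carries an `execution_date` that falls on 2020-08-04 in local time, so subtracting one more day lands on 2020-08-03, which matches the "day before yesterday" result I am seeing.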
Can anybody explain what is happening here, ideally with a reference?
Thanks