0
votes

I have a DAG that takes a very long time to run a BigQuery operation, and I always get the error 'Broken DAG: [/home/airflow/gcs/dags/xyz.py] Timeout'. I found some answers saying that the timeout has to be increased in airflow.cfg, but that is not an option in my project. Is it possible to increase the timeout for one particular DAG? Any help is appreciated. Thank you.

1
That error message hints that parsing and executing your DAG-definition file is taking too long, as in this case. I reckon the relevant setting in airflow.cfg is dagbag_import_timeout, which defaults to 30 seconds - y2k-shubham
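If the timeout does come from parsing the file, a common cause is slow work running at module top level, since that code runs every time the file is imported. A minimal sketch of the idea in plain Python, where run_slow_query is a hypothetical stand-in for the long BigQuery call and not a real client function:

```python
CALLS = []  # records each time the (hypothetical) slow query actually runs


def run_slow_query():
    """Hypothetical stand-in for a long-running BigQuery call."""
    CALLS.append("ran")
    return "done"


# Anti-pattern (left commented out): calling the query at module level
# would run it on every parse of the DAG file and can hit the import timeout.
# result = run_slow_query()


def task_callable():
    # Runs only when the task itself executes, not when the file is parsed.
    return run_slow_query()
```

With this layout, importing the file does no slow work; the query only runs when the callable is invoked by a task.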

1 Answer

2
votes

Yes, you can set the dagrun_timeout parameter on the DAG.

Specify how long a DagRun should be up before timing out / failing, so that new DagRuns can be created. The timeout is only enforced for scheduled DagRuns, and only once the # of active DagRuns == max_active_runs.
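As a rough sketch of how that could look, assuming Airflow 1.10-style DAG arguments: the DAG name, start date and schedule below are made up for illustration, and only dagrun_timeout is the point. The Airflow import is deferred into a function so the snippet can be read and run without Airflow installed.

```python
from datetime import datetime, timedelta

# Hypothetical values for illustration; dagrun_timeout is the relevant one.
DAG_KWARGS = {
    "dag_id": "bq_long_running",          # hypothetical DAG name
    "start_date": datetime(2019, 1, 1),   # hypothetical start date
    "schedule_interval": "@daily",        # hypothetical schedule
    "max_active_runs": 1,
    # A scheduled run still going after 2 hours is timed out; this is
    # enforced once the number of active runs reaches max_active_runs.
    "dagrun_timeout": timedelta(hours=2),
}


def build_dag():
    # Imported lazily so this sketch does not require Airflow to be installed.
    from airflow import DAG
    return DAG(**DAG_KWARGS)


# In a real DAG file the DAG object must exist at module level
# so that the scheduler can discover it:
# dag = build_dag()
```

Note that this limits how long a DagRun may stay active; it does not change how long the file is allowed to take to parse.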

We also have a parameter execution_timeout on each Task that you can set.

execution_timeout: max time allowed for the execution of this task instance, if it goes beyond it will raise and fail. :type execution_timeout: datetime.timedelta

So if one of the tasks runs a query on BigQuery, you can use something like:

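import datetime
# Airflow 1.10 import path for the contrib operator
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

# sql, table_name and dag are assumed to be defined earlier in the DAG file
BigQueryOperator(
    task_id='bq_query',
    sql=sql,
    # the Jinja template must be a quoted string; it is rendered from `params`
    destination_dataset_table='{{ params.t_name }}',
    bigquery_conn_id='my_bq_connection',
    use_legacy_sql=False,
    write_disposition='WRITE_TRUNCATE',
    create_disposition='CREATE_IF_NEEDED',
    params={'t_name': table_name},  # `params`, not query_params, feeds {{ params.* }}
    # fail the task if it runs longer than 10 minutes
    execution_timeout=datetime.timedelta(minutes=10),
    dag=dag)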