1 vote

I started to use Airflow to schedule jobs in our company, and I am wondering about its best practices.

Is it recommended to put all my tasks in one DAG? If not, what is the right balance between one DAG and multiple DAGs?

Our scheduled DAGs run collection, transformation, export, and other computing jobs, so we will continuously have new tasks to add.


1 Answer

2
votes

Generally, one Python file contains a single DAG with multiple tasks, since a DAG is the logical grouping of related tasks.

If you have multiple DAGs with dependencies between them, you can use a TriggerDagRunOperator at the end of DAG1. This triggers DAG2 (defined in a separate DAG file) once all tasks in DAG1 succeed.

An example of this is:

DAG1: https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_trigger_controller_dag.py

DAG2: https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/example_trigger_target_dag.py
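The pattern in the linked examples can be condensed into a sketch like the one below. The DAG ids (`dag_1`, `dag_2`), task ids, and bash commands are hypothetical, and the exact import path and parameters of `TriggerDagRunOperator` vary across Airflow versions, so treat this as a sketch rather than a drop-in file:

```python
# Sketch of a controller/target DAG pair; dag ids and commands are
# hypothetical. Requires Apache Airflow to be installed.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dagrun_operator import TriggerDagRunOperator

# DAG1 (the "controller"): runs its own work on a schedule, then
# triggers DAG2 as its final task.
with DAG("dag_1",
         start_date=datetime(2018, 1, 1),
         schedule_interval="@daily") as dag_1:
    collect = BashOperator(task_id="collect",
                           bash_command="echo collecting")
    trigger = TriggerDagRunOperator(
        task_id="trigger_dag_2",
        trigger_dag_id="dag_2",  # must match the dag_id of the target DAG
    )
    collect >> trigger  # trigger only runs after collect succeeds

# DAG2 (the "target", normally kept in a separate file): no schedule of
# its own; it runs only when DAG1 triggers it.
with DAG("dag_2",
         start_date=datetime(2018, 1, 1),
         schedule_interval=None) as dag_2:
    transform = BashOperator(task_id="transform",
                             bash_command="echo transforming")
```

Because `dag_2` has `schedule_interval=None`, the scheduler never starts it on its own, which keeps the dependency between the two DAGs explicit in `dag_1`.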