
I am trying to manage Airflow DAGs (create, execute, etc.) via a Java backend. Currently, after creating a DAG and placing it in Airflow's dags folder, my backend constantly tries to run the DAG. But it can't run until it's picked up by the Airflow scheduler, which can take quite some time when there are many DAGs. I am wondering if there are any events that Airflow emits which I can tap into to check for new DAGs processed by the scheduler, and then issue the trigger/execute command from my backend. Or is there a way or configuration where Airflow will automatically start a DAG once it processes it, rather than us triggering it?

1 Answer


is there a way or configuration where airflow will automatically start a dag once it processes it rather than we triggering it ?

Yes, one of the DAG parameters you can set is is_paused_upon_creation.

If you set your DAG as:

from datetime import datetime

from airflow import DAG

default_args = {'owner': 'airflow'}

DAG(
    dag_id='tutorial',
    default_args=default_args,
    description='A simple tutorial DAG',
    schedule_interval="@daily",
    start_date=datetime(2020, 12, 28),
    is_paused_upon_creation=False
)

The DAG will start running as soon as it is picked up by the scheduler (assuming the conditions to run it are met).

I am wondering if there any events that airflow emits which I can tap to check for new dags processed by scheduler

In Airflow >= 2.0.0 you can use the stable REST API's list DAGs endpoint (GET /api/v1/dags) to get all DAGs that are in the DagBag.
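As a rough sketch, polling that endpoint from Python could look like the following. The base URL, credentials, and basic-auth setup are assumptions for illustration; which auth backend is active depends on your deployment:

```python
import base64
import json
import urllib.request

AIRFLOW_URL = "http://localhost:8080"  # hypothetical base URL


def extract_dag_ids(payload):
    # The list-DAGs response body has a "dags" array of objects,
    # each carrying a "dag_id" field.
    return [d["dag_id"] for d in payload.get("dags", [])]


def list_dags(user="admin", password="admin"):  # hypothetical credentials
    # GET /api/v1/dags with HTTP basic auth (assumes the basic_auth
    # backend is enabled on the webserver).
    req = urllib.request.Request(f"{AIRFLOW_URL}/api/v1/dags")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return extract_dag_ids(json.load(resp))
```

From a Java backend you would issue the same GET request with any HTTP client; the response parsing is the same.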

In any Airflow version you can use this code to list the dag_ids:

from airflow.models import DagBag

# dag_ids is a property, not a method
print(DagBag().dag_ids)
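Putting the two together, your backend could poll for the new dag_id and only trigger once it appears. A minimal sketch, where list_dags is a stand-in for whichever listing mechanism you use (the REST API call above, or DagBag):

```python
import time


def wait_for_dag(dag_id, list_dags, timeout=300, poll_interval=5):
    """Poll list_dags() until dag_id appears or timeout (seconds) elapses.

    Returns True if the DAG showed up, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if dag_id in list_dags():
            return True
        time.sleep(poll_interval)
    return False
```

Once wait_for_dag returns True, the backend can safely trigger the run, e.g. via POST /api/v1/dags/{dag_id}/dagRuns or the `airflow dags trigger` CLI command.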