I'm fetching the categories from a metadata table and creating a dynamic DAG for each category using a Python script. Right now we have around 15 categories, so each category gets its own DAG. Each DAG has 3 tasks, which run sequentially.
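The generator script looks roughly like this (a simplified sketch: fetch_categories() and the DAG ids are placeholders standing in for the real metadata-table query and category names):

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator


def fetch_categories():
    # Placeholder: the real script queries the metadata table here.
    return ['cat_a', 'cat_b', 'cat_c']


def create_dag(category):
    default_args = {
        'start_date': datetime(2019, 6, 3),
        'retries': 1,
        'retry_delay': timedelta(minutes=5)
    }
    dag = DAG('process_{}'.format(category),
              catchup=False,
              default_args=default_args,
              schedule_interval='*/5 * * * *')
    # The three tasks (same shape as the sample DAG below) are attached here.
    BashOperator(task_id='print_start_date', bash_command='date', dag=dag)
    return dag


# One module-level variable per category so the scheduler discovers each DAG.
for category in fetch_categories():
    globals()['process_{}'.format(category)] = create_dag(category)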
We're using the LocalExecutor. All 15 DAGs (DAG runs) trigger in parallel, but we don't have enough resources to run all 15 DAGs at once (the tasks are heavy).
Is there any way to prioritize the DAG runs? Five DAGs should run first, then the next five, and so on. Jobs should run based on the available resources, and the rest should wait in a queue. This should be dynamic (I've sketched the kind of throttling I have in mind after the sample DAG below).
What's the best way to handle this? Kindly help.
Sample dag:
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from datetime import datetime, timedelta
default_args = {
    'start_date': datetime(2019, 6, 3),
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}
dag = DAG('test', catchup=False, default_args=default_args, schedule_interval='*/5 * * * *')
t1 = BashOperator(
    task_id='print_start_date',
    bash_command='date',
    dag=dag)

t2 = BashOperator(
    task_id='sleep',
    bash_command='sleep 50s',
    retries=3,
    dag=dag)

t3 = BashOperator(
    task_id='print_end_date',
    bash_command='date',
    dag=dag)
t1 >> t2 >> t3
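For illustration, this is roughly the throttling behaviour I have in mind for the heavy task above: a sketch assuming a pool named category_pool with 5 slots created up front (Admin -> Pools in the UI), so only five such tasks run at once and the rest stay queued. I'm not sure whether pools plus priority_weight are the right mechanism for prioritizing whole DAG runs, or whether there is a cleaner way.

t2 = BashOperator(
    task_id='sleep',
    bash_command='sleep 50s',
    retries=3,
    pool='category_pool',    # all heavy tasks share the assumed 5-slot pool
    priority_weight=10,      # higher weight gets scheduled first when slots free up
    dag=dag)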