These days I'm working on a new ETL project and I wanted to give Airflow a try as the job manager. My colleague and I are both working with Airflow for the first time, and we are following two different approaches: I decided to write Python functions (operators like the ones included in the apache-airflow project), while my colleague uses Airflow to call external Python scripts through BashOperator.
I'd like to know whether there is something like a "good practice" here: are the two approaches equally good, or should I prefer one over the other?
To me, the main differences are:
- with BashOperator you can call a Python script using a specific Python environment with specific packages
- with BashOperator the tasks are more independent and can be launched manually if Airflow goes mad
- with BashOperator task-to-task communication is a bit harder to manage
- with BashOperator task errors and failures are harder to manage (how can a bash task know if the task before it failed or succeeded?)
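To make the comparison concrete, here is a minimal sketch of the same step written both ways. It deliberately avoids any Airflow imports so it runs on its own. The names `transform`, `SCRIPT` and `run_as_script` are made up for illustration and are not part of our project or of Airflow. The function is the kind of callable I would hand to a Python-based operator. The subprocess call stands in for what a bash command does: it runs a separate interpreter (which could belong to another environment) and reports success or failure only through its exit code. As far as I understand, BashOperator marks the task as failed when the command exits with a non-zero code.

```python
import subprocess
import sys


def transform(rows):
    """Hypothetical ETL step: the kind of function a Python-based operator would call."""
    if not rows:
        # In-process, a failure is an ordinary exception.
        raise ValueError("no input rows")
    return [r.upper() for r in rows]


# Stand-in for an external script; a real project would keep this in its own file.
SCRIPT = """
import sys
rows = sys.argv[1:]
if not rows:
    sys.exit(1)  # a non-zero exit code signals failure to the caller
print(",".join(r.upper() for r in rows))
"""


def run_as_script(rows, python=sys.executable):
    """Run the same step in a separate interpreter, as a bash command would."""
    # `python` could point at the interpreter of a different virtualenv.
    return subprocess.run(
        [python, "-c", SCRIPT, *rows],
        capture_output=True,
        text=True,
    )


in_process = transform(["a", "b"])   # result comes back as a Python object
ok = run_as_script(["a", "b"])       # result comes back as text on stdout
failed = run_as_script([])           # failure is visible only via the exit code
```

In the first style the result is a Python object and a failure is an exception. In the second, the result has to be parsed from stdout (or written somewhere both tasks can read) and a failure is just `failed.returncode != 0`, which is why task-to-task communication feels harder to me with the script approach.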
What do you think?