I am struggling with what should be a really simple task in Airflow.
For context, I use docker-compose to run Docker containers with Airflow and Postgres (https://github.com/puckel/docker-airflow).
I am trying to test the integration of one of our in-house libraries with Airflow. The not very clean method I use for quick testing is to docker exec into the airflow container and pip install the library, which is shared from the host machine to the container through a Docker volume in read-only mode.
Everything is installed properly with pip and I can use my library when running a dummy Python script.
However, when I put the same logic in a DAG Python file, I get the error "Broken DAG: No module named inhouse_lib".
At first I thought that Airflow was picking up dependencies from a pip directory tied to a specific Python version, and that I had installed the library into a different pip directory.
But all of my Python binaries are Python 3.7.
For every pip binary I have (pip, pip3, pip3.7), pip list shows my in-house library.
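To narrow this down, a small check like the one below could be run both from the dummy script and at the top of the DAG file, so the two outputs can be compared. It is only a diagnostic sketch: `describe_import_env` is a hypothetical helper name, not part of Airflow, and it assumes a top-level module name is passed in.

```python
import importlib.util
import os
import site
import sys


def describe_import_env(module_name):
    """Collect facts about the interpreter that is running this code.

    Hypothetical helper: pass a top-level module name such as "myLib".
    """
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(module_name)
    # Very old virtualenv layouts may lack getusersitepackages, so guard it.
    get_user_site = getattr(site, "getusersitepackages", lambda: None)
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        # The user id matters because pip may install into a per-user directory.
        "uid": os.getuid() if hasattr(os, "getuid") else None,
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site": get_user_site(),
        "module_found": spec is not None,
        "module_origin": getattr(spec, "origin", None),
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    # "myLib" is the placeholder library name used in this question.
    for key, value in describe_import_env("myLib").items():
        print(key, "=", value)
```

If the script and the DAG report a different executable, user id, or sys.path, that difference would be the first thing to look at.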
I fail to understand how I am supposed to deploy my library so that Airflow can pick it up. Any insights would be appreciated.
Thanks for your help.
Edit: To clarify what I am trying to do, here are some details. In my DAG, I want to use a feature of a custom Python library (let's call it myLib) that is not yet implemented. Once it is implemented, I want to deploy this latest version of myLib into the airflow container.
I updated docker-compose.yml with a volume that maps the host directory containing myLib into the Airflow home directory of the container.
# Go in the container
docker exec -it <airflow docker container ID> bash
# Install myLib to Python environment
pip install myLib
# Check the installation
pip list | grep myLib # output myLib
# Check the import in Python REPL
python
import myLib # No Python error
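As a follow-up to the REPL check above, it may help to record where the module was actually loaded from and which search-path entry covers that location. The sketch below assumes nothing about Airflow itself. `module_location` and `covering_path_entry` are hypothetical helper names introduced only for illustration.

```python
import importlib
import os
import sys


def module_location(module_name):
    """Import a module and return the file (or first package dir) it came from."""
    module = importlib.import_module(module_name)
    origin = getattr(module, "__file__", None)
    if origin is None:
        # Namespace packages and built-in modules have no __file__.
        paths = list(getattr(module, "__path__", []))
        return paths[0] if paths else None
    return os.path.abspath(origin)


def covering_path_entry(location, path_entries=None):
    """Return the first search-path entry that contains `location`, else None."""
    entries = sys.path if path_entries is None else path_entries
    location = os.path.abspath(location)
    for entry in entries:
        # An empty entry in sys.path means the current working directory.
        root = os.path.abspath(entry or os.getcwd())
        # join(root, "") appends exactly one trailing separator.
        if location == root or location.startswith(os.path.join(root, "")):
            return entry
    return None


if __name__ == "__main__":
    # "myLib" is the placeholder library name used in this question.
    loc = module_location("myLib")
    print("loaded from:", loc)
    print("covered by :", covering_path_entry(loc) if loc else None)
```

The directory printed here can then be compared against the sys.path seen by the process that parses the DAG file, for example by passing that list as `path_entries`.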
The same import does not work in my Airflow DAG. When checking the container logs, I see the following error:
[2019-08-30 15:14:30,499] {{__init__.py:51}} INFO - Using executor LocalExecutor
[2019-08-30 15:14:30,894] {{dagbag.py:90}} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2019-08-30 15:14:30,897] {{dagbag.py:205}} ERROR - Failed to import: /usr/local/airflow/dags/mydag.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/dagbag.py", line 202, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/usr/local/lib/python3.7/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 696, in _load
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/airflow/dags/mydag.py", line 7, in <module>
    import myLib
ModuleNotFoundError: No module named 'myLib'
[2019-08-30 15:14:31 +0000] [167] [INFO] Handling signal: ttou
[2019-08-30 15:14:31 +0000] [11446] [INFO] Worker exiting (pid: 11446)