7
votes

I am unable to start the airflow webserver using systemd even though it starts and functions properly outside of systemd like so:

export AIRFLOW_HOME=/path/to/my/airflow/home ; airflow webserver -p 8080

The systemd log leads me to believe that the issue comes from gunicorn, even though gunicorn starts without issue when I run the above command (i.e. it's only an issue in systemd). I have configured the following systemd files according to the airflow docs (running Ubuntu 16).

/etc/default/airflow

AIRFLOW_HOME=/path/to/my/airflow/home
SCHEDULER_RUNS=5

/lib/systemd/system/airflow-webserver.service

[Unit]
Description=Airflow webserver daemon
After=network.target

[Service]
EnvironmentFile=/etc/default/airflow
User=ubuntu
Group=ubuntu
Type=simple
ExecStart=/bin/bash -c "export AIRFLOW_HOME=/path/to/my/airflow/home ; airflow webserver -p 8080 "

Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target

/etc/tmpfiles.d/airflow.conf

D /run/airflow 0755 airflow airflow

This results in the following error when I start the service with systemctl.

systemctl start airflow-webserver.service

Jul 15 22:41:27 ip-172-31-19-64 systemd[1]: Started Airflow webserver daemon.
Jul 15 22:41:27 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:27,555] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/Grammar.txt
Jul 15 22:41:27 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:27,592] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/PatternGrammar.txt
Jul 15 22:41:27 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:27,729] {__init__.py:45} INFO - Using executor SequentialExecutor
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   ____________       _____________
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:  ____    |__( )_________  __/__  /________      __
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:28,042] {models.py:189} INFO - Filling up the DagBag from /path/to/my/airflow/home/dags
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: /home/ubuntu/.local/lib/python3.5/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cach
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   .format(x=modname), ExtDeprecationWarning
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Running the Gunicorn Server with:
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Workers: 4 sync
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Host: 0.0.0.0:8080
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Timeout: 120
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Logfiles: - -
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: =================================================================
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Traceback (most recent call last):
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/local/bin/airflow", line 27, in <module>
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     args.func(args)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/local/lib/python3.5/dist-packages/airflow/bin/cli.py", line 788, in webserver
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     gunicorn_master_proc = subprocess.Popen(run_args)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     restore_signals, start_new_session)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/lib/python3.5/subprocess.py", line 1551, in _execute_child
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     raise child_exception_type(errno_num, err_msg)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: FileNotFoundError: [Errno 2] No such file or directory: 'gunicorn'
Jul 15 22:41:28 ip-172-31-19-64 systemd[1]: airflow-webserver.service: Main process exited, code=exited, status=1/FAILURE

Is there some configuration I need to do to make gunicorn compatible with systemd?

Edit: Following suggestions that this was a permission issue, I installed gunicorn via sudo apt-get install gunicorn and, upon re-running systemctl, got the following error: Error: No module named airflow.www.gunicorn_config. I figured that this was due to an inconsistency between the gunicorn that I had just installed and the gunicorn my ubuntu user was using to run airflow, so I replaced the gunicorn in /usr/bin/ with the latter. This hotfix is likely not the best fix, but afterwards I was able to run airflow via systemd.
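One way to check for the kind of mismatch suspected above is to compare the interpreters named on the shebang lines of the two scripts. The following is only a sketch: the helper names shebang_interpreter and same_interpreter are made up for illustration, and the comparison is a plain string match, so "python3" and "/usr/bin/python3" would count as different even if they resolve to the same binary.

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter named on a script's shebang line, or None."""
    with open(script_path, "rb") as fh:
        first_line = fh.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    # "#!/usr/bin/env python3" names the interpreter in the second field.
    if os.path.basename(parts[0]) == "env" and len(parts) > 1:
        return parts[1]
    return parts[0]


def same_interpreter(script_a, script_b):
    """True when both scripts declare the same interpreter (string comparison only)."""
    a = shebang_interpreter(script_a)
    b = shebang_interpreter(script_b)
    return a is not None and a == b
```

For example, same_interpreter("/usr/bin/gunicorn", "/usr/local/bin/airflow") would return False if the two entry points were installed for different Python interpreters, which would be consistent with gunicorn failing to import airflow.www.gunicorn_config.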

This happens because different Python environments are involved. Gunicorn is available to the user under which you ran the command manually, but the gunicorn package is not available in the root Python. Install gunicorn in the root Python and start airflow again. Please post the output as well. - Amal G Jose
@AmalGJose success! I ran sudo apt-get install gunicorn and received a different module-not-found error. So I copied the gunicorn executable that my ubuntu user was running into /usr/bin/, and it worked. I'm guessing this is bad practice, so I'm curious whether you think there's a better solution, but I'm glad I got it working. Thanks! - Ludwig von Mises
@tobi6, success! ^ additional details above - Ludwig von Mises
Yes, the approach that I suggested is just a workaround. The actual solution is to configure the correct PYTHONPATH in the unit's environment variable configuration, so that systemd uses the same PYTHONPATH in which airflow and its dependent packages are present. Please refer to the Environment section of the document 0pointer.de/public/systemd-man/systemd.exec.html#Environment= - Amal G Jose
@AmalGJose thanks, I'll check it out - Ludwig von Mises
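The traceback in the question ends in subprocess.Popen failing to find 'gunicorn'. On POSIX systems, Popen resolves a bare command name against the PATH of the environment it is given, so a service started with a shorter PATH than an interactive shell can fail even though the same command works by hand. Here is a minimal sketch of that behaviour; the helper names and the /nonexistent-dir path are invented for illustration and are not part of airflow or systemd.

```python
import os
import shutil
import subprocess
import sys


def find_executable(name, path_value):
    """Resolve `name` against an explicit PATH string; returns None if not found."""
    return shutil.which(name, path=path_value)


def launch_raises_file_not_found(name, path_value):
    """Return True if spawning `name` with only PATH=path_value raises FileNotFoundError."""
    try:
        subprocess.run(
            [name, "--version"],
            env={"PATH": path_value},  # the child lookup uses this PATH only
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return True
    return False


if __name__ == "__main__":
    interpreter_dir = os.path.dirname(sys.executable)
    interpreter_name = os.path.basename(sys.executable)
    # Found when its own directory is on PATH, missing otherwise.
    print(find_executable(interpreter_name, interpreter_dir))
    print(find_executable(interpreter_name, "/nonexistent-dir"))
    # Same failure mode as the traceback in the question.
    print(launch_raises_file_not_found("gunicorn", "/nonexistent-dir"))
```

Running find_executable("gunicorn", ...) with the PATH value seen by the service (for example, as printed by systemctl show for the unit) should indicate whether the lookup can succeed there.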

3 Answers

5
votes

I had the same problem on Ubuntu 18.04 LTS with Apache Airflow version 1.10.1 installed in a virtual environment under /srv/airflow. After much trial and error, I ended up with this working solution.

My airflow-webserver.service file:

[Unit]
Description=Airflow webserver daemon
After=network.target

[Service]
Environment="PATH=/srv/airflow/bin"
Environment="AIRFLOW_HOME=/srv/airflow"
User=airflow
Group=airflow
Type=simple
ExecStart=/srv/airflow/bin/airflow webserver --pid /srv/airflow/webserver.pid
Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target

I did this to install the service:

sudo cp airflow-webserver.service /lib/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable airflow-webserver.service
sudo systemctl start airflow-webserver.service
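
The unit file above presumably works because PATH contains the virtual environment's bin directory, where pip normally places gunicorn next to airflow. As a rough sanity check, the sketch below reads a unit file's text and reports whether the PATH set through Environment= includes the directory of the ExecStart binary. It is a deliberate simplification and not how systemd itself parses units: it ignores sections, EnvironmentFile=, line continuations, and ExecStart prefixes such as "-" or "@". The function names are hypothetical.

```python
import os
import shlex


def parse_service_settings(unit_text):
    """Collect Environment= assignments and the ExecStart= command from unit text."""
    env, exec_start = {}, None
    for raw_line in unit_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Environment="):
            # One Environment= line may carry several quoted KEY=value items.
            for item in shlex.split(line[len("Environment="):]):
                key, _, value = item.partition("=")
                env[key] = value
        elif line.startswith("ExecStart="):
            exec_start = shlex.split(line[len("ExecStart="):])
    return env, exec_start


def path_covers_exec_dir(unit_text):
    """True if the unit sets PATH and it includes the ExecStart binary's directory."""
    env, exec_start = parse_service_settings(unit_text)
    if not exec_start or "PATH" not in env:
        return False
    exec_dir = os.path.dirname(exec_start[0])
    return exec_dir in env["PATH"].split(":")
```

Applied to the unit file in this answer, path_covers_exec_dir returns True; applied to the unit file in the question, which sets no PATH, it returns False.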

0
votes

Yeah, ExecStart=..airflow/bin/airflow requires ..airflow/bin/python but systemd found system python instead. Airflow needs Environment="PATH=..airflow/bin", always.

-1
votes

On Ubuntu Bionic, I found that running sudo apt-get install python3-gunicorn first and then sudo apt-get install gunicorn, while in the root Python environment, resolves this problem.