18
votes

My Flask app is composed of four containers: web app, Postgres, RabbitMQ and Celery. Since I have Celery tasks that run periodically, I am using Celery beat. I've configured my docker-compose file like this:

version: '2'
services:
  web:
    # ...
  postgres:
    # ...
  rabbit:
    # ...
  celery:
    build:
      context: .
      dockerfile: Dockerfile.celery

And my Dockerfile.celery looks like this:

# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-B", "-l", "INFO"]

While I read in the docs that I shouldn't go to production with the -B option, I hastily added it anyway (and forgot about changing it) and quickly learned that my scheduled tasks were running multiple times. For those interested, if you do a ps aux | grep celery from within your celery container, you'll see multiple celery + beat processes running (but there should only be one beat process and however many worker processes). I wasn't sure from the docs why you shouldn't run -B in production but now I know.

So then I changed my Dockerfile.celery to:

# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-l", "INFO"]
CMD ["celery", "-A", "app.tasks.celery", "beat", "-l", "INFO"]

Now when I start my app, the worker processes start but beat does not. When I flip those commands around so that beat is called first, then beat starts but the worker processes do not. So my question is: how do I run celery worker + beat together in my container? I have combed through many articles/docs but I'm still unable to figure this out.

EDITED

I changed my Dockerfile.celery to the following:

ENTRYPOINT [ "/bin/sh" ]
CMD [ "./docker.celery.sh" ]    

And my docker.celery.sh file looks like this:

#!/bin/sh -ex
celery -A app.tasks.celery beat -l debug &
celery -A app.tasks.celery worker -l info &

However, the container exits immediately with celery_1 exited with code 0

Edit #2

I added the following blocking command to the end of my docker.celery.sh file and all was fixed:

tail -f /dev/null
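For reference, the complete docker.celery.sh then looks like this (everything as above, with the blocking command added at the end so the script — the container's main process — never exits):

```shell
#!/bin/sh -ex
# start beat and the worker in the background...
celery -A app.tasks.celery beat -l debug &
celery -A app.tasks.celery worker -l info &
# ...then block forever so the container keeps running
tail -f /dev/null
```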
3
you can create another container just for beat and override its worker command... when you scale your workers you will be scaling just them and not the beat (scheduler) too - Mazel Tov
@MazelTov - good suggestion and for my next project I'll consider putting these in separate containers. For various reasons, I needed both of these processes to run in the same container. - hugo
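For anyone who can take the separate-container route from the comment above, a hedged compose sketch (service names are illustrative; both services reuse the same image and just override the command):

```yaml
  celery_worker:
    build:
      context: .
      dockerfile: Dockerfile.celery
    command: celery -A app.tasks.celery worker -l INFO
  celery_beat:
    build:
      context: .
      dockerfile: Dockerfile.celery
    command: celery -A app.tasks.celery beat -l INFO
```

With this layout, `docker-compose up --scale celery_worker=3` scales only the workers, and exactly one beat scheduler ever runs.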

3 Answers

11
votes

Docker runs only one CMD, so only the last CMD in the Dockerfile gets executed. The workaround is to create a shell script that executes both worker and beat, and use the Docker CMD to run this script.
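A minimal sketch of that approach (the script name is illustrative; the Celery app path is taken from the question):

```shell
#!/bin/sh
# start-celery.sh -- run beat in the background, then exec the worker
# in the foreground so it becomes the container's main process and
# keeps the container alive (no need for `tail -f /dev/null`)
celery -A app.tasks.celery beat -l info &
exec celery -A app.tasks.celery worker -l info
```

Then in the Dockerfile, replace the two CMD lines with a single `CMD ["sh", "./start-celery.sh"]`.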

0
votes

You can use celery beatX for beat. It is allowed (and recommended) to have multiple beatX instances. They use locks to synchronize.

I cannot say whether it is production-ready, but it works like a charm for me (with the -B option).

0
votes

I got this working by putting the commands in an entrypoint script as explained above, plus I added &> to redirect the output to a log file.

my entrypoint.sh

#!/bin/bash
python3 manage.py migrate
python3 manage.py migrate catalog --database=catalog
python manage.py collectstatic --clear --noinput --verbosity 0

# Start a Celery worker
celery worker --workdir /app --app dri -l info &> /log/celery.log  &

# Start a second Celery worker with an embedded beat scheduler (--beat)
celery worker --workdir /app --app dri -l info --beat &> /log/celery_beat.log  &

python3 manage.py runserver 0.0.0.0:8000
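A matching Dockerfile fragment could look like this (a sketch; the path and filename are assumptions):

```
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```

Since `runserver` runs in the foreground, the script itself keeps the container alive, so no extra blocking command is needed here.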