3
votes

I have a Python/Django project running on uwsgi/nginx. For asynchronous tasks we are using rabbitmq/celeryd, and supervisord to manage all the daemons.

Versions:

  • python: 2.7
  • django: 1.9.7
  • celery: 3.1.23
  • django-celery: 3.1.17

Celery has 10 queues of type direct (say queue1, queue2, ...). Each queue is handled by a separate celeryd process, which is managed via supervisord. Each supervisord program entry looks like the following:

[program:app_queue_worker]
command=/var/www/myproj/venv/bin/celery worker -A myproj -c 2 --queue=queue1 --loglevel=INFO
directory=/var/www/myproj/
user=ubuntu
numprocs=1
autostart=true
autorestart=true
startsecs=10
exitcodes=1
stopwaitsecs = 600
killasgroup=true
priority=1000

Hence supervisord is running 10 main processes and 20 worker processes.

Another thing I have noticed is that uwsgi also spawns some celery workers (I don't understand how or why, yet) with concurrency=2. So if I have 4 uwsgi processes running, I will have an additional 10 celery workers running.

All these workers are each taking 200-300MB of memory. Something is wrong here, I can feel it, but I am not able to put my finger on it. Celery shouldn't be running such memory-heavy processes, should it?

Note: DEBUG=False, so there is no memory leak due to debug mode.

Can someone please comment on whether this architecture is correct or wrong?

Would it be better to run 2-3 celery main processes which listen to all queues at once, and increase their concurrency?

Update: celery.py config

import os

from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'MyProject.settings')

from django.conf import settings  # noqa
from chatterbox import celery_settings

app = Celery('MyProject')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')

app.conf.update(
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
    CELERYD_CONCURRENCY=1,
)

app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
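As an aside, if per-worker memory growth is a concern, Celery 3.x exposes a setting to recycle pool children after a fixed number of tasks, which caps any gradual accumulation. A hypothetical addition to the config above (the value 100 is an arbitrary illustration, not from the question):

```
app.conf.update(
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
    CELERYD_CONCURRENCY=1,
    # Hypothetical: restart each pool child after it has executed 100
    # tasks, so leaked memory in task code is periodically reclaimed.
    CELERYD_MAX_TASKS_PER_CHILD=100,
)
```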
How do you measure the RAM consumption? - Krzysztof Szularz
With the top and htop commands - Crazyshezy
Is uwsgi starting up worker processes because there are multiple uwsgi processes? (You should be able to limit uwsgi to 1 worker to test this.) - Jesse
@Crazyshezy I mean is it VIRT, RES or SHR memory? - Krzysztof Szularz
@KrzysztofSzularz - looking at the RES value - Crazyshezy

2 Answers

0
votes

There is no simple answer to this.

To me, the fact that uwsgi spawns celery workers is wrong.

Creating only worker processes that consume all queues might lead to a situation where long-running tasks make some queues overflow, whereas separate workers that consume specific queues with short-running tasks could handle the situation better. Everything depends on your use case.

The 300MB of resident memory is quite a lot. If the tasks are I/O-bound, go multi-threaded/gevent. However, if the tasks are CPU-bound, you have no option other than to scale process-wise.
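For the I/O-bound case, a hypothetical invocation using Celery's gevent pool instead of prefork children (assumes the gevent package is installed; project and queue names taken from the question):

```
# One process, many lightweight green-thread workers instead of
# forked children -- suitable only for I/O-bound tasks.
celery worker -A myproj -P gevent -c 100 --queue=queue1 --loglevel=INFO
```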

0
votes

If you start a celery worker with a concurrency of n, it will spawn n + 1 processes by default. Since you are spawning 10 workers with a concurrency of 2, celery will start 30 processes.

Each worker consumes ~60MB (~30MB for the main process and 2 × ~15MB for the subprocesses) of memory when not consuming queues. It may vary depending on what your worker is doing. If you start 10 workers, they will consume ~600MB of memory.
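That estimate can be sketched as a quick back-of-envelope calculation (the ~30MB/~15MB figures are the rough observations above, not exact measurements):

```python
def celery_memory_mb(workers, concurrency, main_mb=30, child_mb=15):
    """Rough resident-memory estimate for a fleet of prefork workers."""
    # Each worker is 1 main process plus `concurrency` pool children.
    processes = workers * (1 + concurrency)
    memory = workers * (main_mb + concurrency * child_mb)
    return processes, memory

procs, mem = celery_memory_mb(workers=10, concurrency=2)
print(procs, mem)  # 30 processes, ~600MB total
```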

I am not sure how you came to know that uwsgi also spawns some celery workers. Only supervisord should spawn the processes.

You can run just one celery worker which listens to all queues, with a concurrency of 20. This will reduce your memory usage at the cost of flexibility: with that setup you can't start/stop consuming from selected queues, and there is no guarantee that all queues will be consumed equally.
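A hypothetical command for that consolidated setup (project and queue names as in the question; `--queue` accepts a comma-separated list):

```
# One main process with a pool of 20 children, consuming all ten queues.
celery worker -A myproj -c 20 \
  --queue=queue1,queue2,queue3,queue4,queue5,queue6,queue7,queue8,queue9,queue10 \
  --loglevel=INFO
```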