3
votes

Question

After running tasks via celery's periodic task scheduler, beat, why do I have so many unconsumed queues remaining in RabbitMQ?

Setup

  • Django web app running on Heroku
  • Tasks scheduled via celery beat
  • Tasks run via celery worker
  • Message broker is RabbitMQ from CloudAMQP

Procfile

web: gunicorn --workers=2 --worker-class=gevent --bind=0.0.0.0:$PORT project_name.wsgi:application
scheduler: python manage.py celery worker --loglevel=ERROR -B -E --maxtasksperchild=1000
worker: python manage.py celery worker -E --maxtasksperchild=1000 --loglevel=ERROR

settings.py

import datetime

CELERYBEAT_SCHEDULE = {
    'do_some_task': {
        'task': 'project_name.apps.appname.tasks.some_task',
        # Run every 15 minutes
        'schedule': datetime.timedelta(minutes=15),
        # The task takes no positional arguments
        'args': ()
    },
}

tasks.py

@celery.task
def some_task():
    # Get some data from external resources
    # Save that data to the database
    # No return value specified
    pass

Result

Every time the task runs, I get (via the RabbitMQ web interface):

  • An additional message in the "Ready" state under my "Queued Messages"
  • An additional queue with a single message in the "ready" state
    • This queue has no listed consumers

2 Answers

4
votes

It ended up being my setting for CELERY_RESULT_BACKEND.

Previously, it was:

CELERY_RESULT_BACKEND = 'amqp'

I no longer had unconsumed messages / queues in RabbitMQ after I changed it to:

CELERY_RESULT_BACKEND = 'database'

What appears to have been happening is that after each task finished, celery sent the task's result back through RabbitMQ. Nothing was set up to consume those result messages, so they piled up unread, each in its own queue.
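
To make the mechanism concrete, here is a toy model in plain Python. It is not celery's actual code; it only assumes the behaviour described above, where each finished task publishes its result to a queue of its own and nothing reads it. `ToyBroker` and `run_task` are hypothetical names made up for this sketch.

```python
import uuid


class ToyBroker:
    """Stand-in for RabbitMQ: maps queue name -> list of pending messages."""

    def __init__(self):
        self.queues = {}

    def publish(self, queue_name, message):
        self.queues.setdefault(queue_name, []).append(message)

    def consume(self, queue_name):
        # Reading the result removes the per-task queue and returns its messages.
        return self.queues.pop(queue_name, [])


def run_task(broker, store_result=True):
    """Pretend to run a task; optionally publish its result to its own queue."""
    task_id = uuid.uuid4().hex
    if store_result:
        broker.publish(task_id, {"status": "SUCCESS", "result": None})
    return task_id


broker = ToyBroker()
# Four runs = one hour of a 15-minute beat schedule; nobody reads the results.
task_ids = [run_task(broker) for _ in range(4)]
print(len(broker.queues))  # 4
```

Each run leaves one more queue holding one "ready" message, which matches what the RabbitMQ web interface showed in the question.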

NOTE: This means that celery now adds database entries recording the outcome of each task. To keep my database from filling up with useless records, I added:

# Delete result records ("tombstones") from database after 4 hours
# http://docs.celeryproject.org/en/latest/configuration.html#celery-task-result-expires
CELERY_TASK_RESULT_EXPIRES = 14400
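
As a quick check, 14400 is 4 hours expressed in seconds. According to the celery configuration docs this setting also accepts a `datetime.timedelta`, which some may find easier to read. A small sketch of that alternative:

```python
import datetime

# 4 hours written as a timedelta instead of a bare number of seconds
CELERY_TASK_RESULT_EXPIRES = datetime.timedelta(hours=4)

print(int(CELERY_TASK_RESULT_EXPIRES.total_seconds()))  # 14400
```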

Relevant parts of settings.py

########## CELERY CONFIGURATION
import os  # needed for os.environ below

import djcelery
# https://github.com/celery/django-celery/
djcelery.setup_loader()

INSTALLED_APPS = INSTALLED_APPS + (
    'djcelery',
)

# Compress all the messages using gzip
# http://celery.readthedocs.org/en/latest/userguide/calling.html#compression
CELERY_MESSAGE_COMPRESSION = 'gzip'

# See: http://docs.celeryproject.org/en/latest/configuration.html#broker-transport
BROKER_TRANSPORT = 'amqplib'

# Set this number to the number of concurrent connections your AMQP provider
# allows, divided by the number of active workers you have.
#
# For example, if you have the 'Little Lemur' CloudAMQP plan (their free tier),
# they allow 3 concurrent connections. So if you run a single worker, you'd
# want this number to be 3. If you had 3 workers running, you'd lower this
# number to 1, since 3 workers each maintaining one open connection = 3
# connections total.
#
# See: http://docs.celeryproject.org/en/latest/configuration.html#broker-pool-limit
BROKER_POOL_LIMIT = 3

# See: http://docs.celeryproject.org/en/latest/configuration.html#broker-connection-max-retries
BROKER_CONNECTION_MAX_RETRIES = 0

# See: http://docs.celeryproject.org/en/latest/configuration.html#broker-url
BROKER_URL = os.environ.get('CLOUDAMQP_URL')

# Previously this was set to 'amqp', which resulted in many unread /
# unconsumed queues and messages in RabbitMQ
# See: http://docs.celeryproject.org/en/latest/configuration.html#celery-result-backend
CELERY_RESULT_BACKEND = 'database'

# Delete result records ("tombstones") from database after 4 hours
# http://docs.celeryproject.org/en/latest/configuration.html#celery-task-result-expires
CELERY_TASK_RESULT_EXPIRES = 14400
########## END CELERY CONFIGURATION
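
The rule of thumb in the BROKER_POOL_LIMIT comment above (allowed connections divided by active workers) can be written as a small helper. `pool_limit` is a hypothetical function for illustration only, not part of celery or CloudAMQP:

```python
def pool_limit(max_connections, worker_count):
    """Split a provider's connection allowance evenly across workers."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # Integer division rounds down; a worker still needs at least one
    # connection, so never go below 1.
    return max(1, max_connections // worker_count)


print(pool_limit(3, 1))  # 3
print(pool_limit(3, 3))  # 1
```

With the two worker processes from the Procfile in the question (scheduler and worker) and a 3-connection plan, the same rule gives 1.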
1
votes

Looks like you are getting back responses from your consumed tasks.

You can avoid that by doing:

@celery.task(ignore_result=True)
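
If none of your tasks need their return values, the same thing can probably be done once for the whole project instead of per task. A sketch for settings.py, assuming the old-style uppercase setting names used elsewhere in this thread:

```python
# settings.py
# Do not store return values or states for any task
CELERY_IGNORE_RESULT = True
```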