6
votes

I am running Celery workers on Heroku and one of the tasks hit its timeout limit. When I retried it manually, everything worked fine, so it was probably a connection issue. I am using RabbitMQ as a broker, and Celery is configured to acknowledge tasks late (CELERY_ACKS_LATE=True). I expected the task to be returned to the RabbitMQ queue and processed again by another worker, but that didn't happen. Do I need to configure anything else for a task to be returned to the RabbitMQ queue when a worker times out?
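My setup looks roughly like this (Celery 3.x setting names; the broker URL is a placeholder, not my real credentials):

```python
from celery import Celery

app = Celery("simulator", broker="amqp://user:pass@host/vhost")  # placeholder URL

app.conf.update(
    CELERY_ACKS_LATE=True,       # ack only after the task finishes
    CELERYD_TASK_TIME_LIMIT=60,  # hard time limit seen in the logs below
)
```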

Here are logs:

Traceback (most recent call last): 
  File "/app/.heroku/python/lib/python3.4/site-packages/billiard/pool.py", line 639, in on_hard_timeout 
    raise TimeLimitExceeded(job._timeout) 
billiard.exceptions.TimeLimitExceeded: TimeLimitExceeded(60,) 
[2015-09-02 06:22:14,504: ERROR/MainProcess] Hard time limit (60s) exceeded for simulator.tasks.run_simulations[4e269d24-87a5-4038-b5b5-bc4252c17cbb] 
[2015-09-02 06:22:18,877: INFO/MainProcess] missed heartbeat from celery@420cc07b-f5ba-4226-91c9-84a949974daa 
[2015-09-02 06:22:18,922: ERROR/MainProcess] Process 'Worker-1' pid:9 exited with 'signal 9 (SIGKILL)' 

1 Answer

3
votes

Looks like you're hitting Celery time limits. http://docs.celeryproject.org/en/latest/userguide/workers.html#time-limits

Celery doesn't implement retry logic for tasks by default because it doesn't know if retries are safe for your tasks. Namely, your task needs to be idempotent for retries to be safe.

So any retries after a task failure should be triggered from within the task itself. See the example here: http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.retry

There are a few reasons why your task could have timed out, but you'd know best. The task could have timed out because it was taking too long to process data or because it was taking too long to fetch data.

If you believe the task is failing while trying to connect to some service, I suggest decreasing the connection timeout interval and adding retry logic to your task. If the task is taking too long to process data, try splitting your data into chunks and processing it that way. Celery has nice support for this: http://docs.celeryproject.org/en/latest/userguide/canvas.html#chunks