Celery is failing on one of my dotcloud deployments, and I'm not sure how to fix it. The deployment is almost identical to an existing dotcloud deployment (verified with a file diff), which seems to be working fine.
The error I get in the djcelery log:
dotcloud@hack-default-www-0:/var/log/supervisor$ more djcelery_error.log
/home/dotcloud/env/lib/python2.6/site-packages/django/conf/__init__.py:75: DeprecationWarning: The ADMIN_MEDIA_PREFIX setting has been removed; use STATIC_URL instead.
  "use STATIC_URL instead.", DeprecationWarning)
/home/dotcloud/env/lib/python2.6/site-packages/djcelery/loaders.py:108: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
  warnings.warn("Using settings.DEBUG leads to a memory leak, never "
[2012-06-04 03:27:32,139: WARNING/MainProcess]
 -------------- celery@hack-default-www-0 v2.5.3
---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: amqp://[email protected]:29210//
- ** ---------- . loader: djcelery.loaders.DjangoLoader
- ** ---------- . logfile: [stderr]@INFO
- ** ---------- . concurrency: 2
- ** ---------- . events: ON
- *** --- * --- . beat: OFF
-- ******* ----
--- ***** ----- [Queues]
 -------------- . celery: exchange:celery (direct) binding:celery

[Tasks]
  . experiments.tasks.pushMessageToIphone
  . experiments.tasks.sendTestMessage
[2012-06-04 03:27:32,172: INFO/PoolWorker-1] child process calling self.run()
[2012-06-04 03:27:32,185: INFO/PoolWorker-2] child process calling self.run()
[2012-06-04 03:27:32,188: WARNING/MainProcess] celery@hack-default-www-0 has started.
[2012-06-04 03:27:35,315: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 2 seconds...
[2012-06-04 03:27:40,374: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 4 seconds...
[2012-06-04 03:27:47,479: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 6 seconds...
[2012-06-04 03:27:56,509: ERROR/MainProcess] Consumer: Connection Error: Socket
Interestingly, the celerycam error log shows something a bit different. I'm not sure whether this is a red herring:
/home/dotcloud/env/lib/python2.6/site-packages/django/conf/__init__.py:75: DeprecationWarning: The ADMIN_MEDIA_PREFIX setting has been removed; use STATIC_URL instead.
  "use STATIC_URL instead.", DeprecationWarning)
[2012-06-04 03:27:31,373: INFO/MainProcess] -> evcam: Taking snapshots with djcelery.snapshot.Camera (every 1.0 secs.)
Traceback (most recent call last):
  File "hack/manage.py", line 14, in <module>
    execute_manager(settings)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/__init__.py", line 459, in execute_manager
    utility.execute()
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/base.py", line 74, in run_from_argv
    return super(CeleryCommand, self).run_from_argv(argv)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/base.py", line 67, in execute
    super(CeleryCommand, self).execute(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/commands/celerycam.py", line 26, in handle
    ev.run(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/bin/celeryev.py", line 38, in run
    detach=detach)
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/bin/celeryev.py", line 70, in run_evcam
    return cam()
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/snapshot.py", line 116, in evcam
    recv.capture(limit=None)
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py", line 204, in capture
    list(self.itercapture(limit=limit, timeout=timeout, wakeup=wakeup))
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py", line 193, in itercapture
    with self.consumer(wakeup=wakeup) as consumer:
  File "/usr/lib/python2.6/contextlib.py", line 16, in __enter__
    return self.gen.next()
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py", line 185, in consumer
    queues=[self.queue], no_ack=True)
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/messaging.py", line 279, in __init__
    self.revive(self.channel)
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/messaging.py", line 286, in revive
    channel = channel.default_channel
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", line 581, in default_channel
    self.connection
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", line 574, in connection
    self._connection = self._establish_connection()
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", line 533, in _establish_connection
    conn = self.transport.establish_connection()
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/transport/amqplib.py", line 279, in establish_connection
    connect_timeout=conninfo.connect_timeout)
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/transport/amqplib.py", line 89, in __init__
    super(Connection, self).__init__(*args, **kwargs)
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/connection.py", line 144, in __init__
    (10, 30), # tune
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/abstract_channel.py", line 95, in wait
    self.channel_id, allowed_methods)
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/connection.py", line 202, in _wait_method
    self.method_reader.read_method()
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
    raise m
IOError: Socket closed
My supervisord file:
[program:djcelery]
directory = /home/dotcloud/current/
command = /home/dotcloud/env/bin/python hack/manage.py celeryd -E -l info -c 2
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log

[program:celerycam]
directory = /home/dotcloud/current/
command = /home/dotcloud/env/bin/python hack/manage.py celerycam
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
As mentioned, I have nearly identical code deployed under a different dotcloud account that is working fine.
Status of the rabbitmq broker:
$ ./dotcloud info hack.broker
aliases:
- hackxxxx.dotcloud.com
config:
  password: xxxx
  rabbitmq_management: true
  user: root
created_at: 1338702527.075196
datacenter: Amazon-us-east-1c
image_version: 924a079b622a (latest)
memory: 49M/512M (9%)
ports:
- name: ssh
  url: ssh://[email protected]:29209
- name: amqp
  url: amqp://root:[email protected]:29210
- name: http
  url: http://root:[email protected]/
state: running
type: rabbitmq
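
To compare what Celery is configured with against what dotcloud reports for the broker, a rough check along these lines might help. This is only a sketch under my own assumptions: it uses the Python 3 standard library (the deployment itself runs Python 2.6), `parse_broker_url` and `tcp_reachable` are hypothetical helper names, and the URL in the usage note is a placeholder, not my real broker. It only tests whether the TCP port accepts connections; it does not perform the AMQP handshake, which is where the traceback above fails (amqplib raises while waiting for the broker's reply during connection setup).

```python
import socket
from urllib.parse import urlsplit


def parse_broker_url(url):
    """Split an amqp:// URL into the pieces worth comparing by eye.

    Assumption: the URL has the usual amqp://user:password@host:port/vhost
    shape. A path of "//" (as in the worker banner) is read as vhost "/".
    """
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5672,      # 5672 is the standard AMQP port
        "vhost": parts.path[1:] or "/",  # empty path falls back to "/"
    }


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
    conn.close()
    return True
```

Running `parse_broker_url("amqp://user:secret@broker.example.com:29210//")` for both the URL in the Django settings and the one from `dotcloud info` would show whether user, password, host, port, and vhost really match. If `tcp_reachable` returns True for that host and port while Celery still reports "Socket closed", the problem is presumably above the network level.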