
So I'm running celery on multiple servers with a clustered rabbitmq backend. Recently, anything I do with celery has begun hanging indefinitely, and checking the logs for rabbitmq provides me with this obscure error message:

=ERROR REPORT==== 20-Mar-2013::23:52:25 ===
connection <0.15823.3>, channel 1 - soft error:
{amqp_error,not_found,
        "no binding i-69995906 between exchange 'i-69995906' in vhost 'celery' and queue 'i-69995906' in vhost 'celery'",
        'queue.bind'}

Running rabbitmqctl list_bindings gives me this:

# rabbitmqctl list_bindings -p celery
Listing bindings ...
        exchange    celery  queue   celery  []
celery  exchange    celery  queue   celery  []
...done.

What do I need to do to get rid of the error? I've already restarted RabbitMQ, reinstalled RabbitMQ, and deleted and restored the cluster. I'm guessing that I need to restore the preexisting binding, but I don't know how to do that from rabbitmqctl or celery. Until this is fixed, none of my celery tasks work at all.


3 Answers

5 votes

I had the same problem and was able to fix it without having to shutdown the cluster or reset the virtual host.

I had a queue with 3 routing keys bound in a cluster. I had to remove the queue while one of the nodes was down, and after that I always got the "no binding between exchange in vhost and queue" error when trying to register the routing keys again on a newly created queue with the same name.

The original queue was created as 'Durable' and the solution was to:

  • Delete the queue
  • Create a new queue with the same name but 'Transient' (non-durable)
  • Re-register the original 3 routing keys on the queue. The errors stopped.

As I wanted a durable queue, I then deleted the queue again, created a new 'Durable' queue with the same name, and binding the routing keys worked perfectly.
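If you prefer the command line over the management UI, the delete → transient recreate → rebind → durable recreate cycle above might look roughly like this with the management plugin's rabbitmqadmin tool. The queue/exchange name is taken from the question; the routing keys are placeholders for your own:

```shell
# Sketch of the cycle above via rabbitmqadmin (management plugin).
# Queue name from the question; routing keys are placeholders.
Q=i-69995906

# 1. Delete the problematic durable queue
rabbitmqadmin -V celery delete queue name="$Q"

# 2. Recreate it as transient (non-durable) and bind the routing keys
rabbitmqadmin -V celery declare queue name="$Q" durable=false
for key in key-a key-b key-c; do
  rabbitmqadmin -V celery declare binding source="$Q" destination="$Q" routing_key="$key"
done

# 3. Delete it once more, recreate it as durable, and rebind
rabbitmqadmin -V celery delete queue name="$Q"
rabbitmqadmin -V celery declare queue name="$Q" durable=true
for key in key-a key-b key-c; do
  rabbitmqadmin -V celery declare binding source="$Q" destination="$Q" routing_key="$key"
done
```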

Maybe creating a new queue with a different durability type reset the old bindings that were still lingering somewhere.

4 votes

Thanks for the question. I ended up in exactly the same place.

I was able to correct this issue by deleting the vhost and recreating it:

rabbitmqctl delete_vhost celery
rabbitmqctl add_vhost celery
rabbitmqctl set_permissions -p celery <user> ".*" ".*" ".*"
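Afterwards you can check that the fresh vhost is empty and the permissions took, then restart the workers; as far as I know, celery (via kombu) declares its exchange, queue, and binding again on startup:

```shell
# Sanity checks on the freshly recreated vhost
rabbitmqctl list_permissions -p celery
rabbitmqctl list_queues -p celery      # should be empty at this point
rabbitmqctl list_bindings -p celery

# Restart the workers so celery redeclares its exchange/queue/binding.
# The init-script name is an assumption; adjust to your setup.
service celeryd restart
```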
2 votes

I also had this error, and the only way to solve it was to shut down the whole cluster at once and leave it off for a few seconds.

Preface: We had experienced some partitions before; we were not able to publish to one queue, were not able to recreate the binding, and got the same error as you.

Stopping and starting the nodes one by one doesn't work. The error stays, and I assume some node of the cluster had cached a faulty IP or config.

Detection: A good way to determine whether this is your error is to run rabbitmqctl list_queues on all nodes. If the nodes show different queues, then something went wrong.
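A quick loop over the nodes makes this comparison easy. The host names here are placeholders for your actual cluster nodes:

```shell
# Compare the queue list on every cluster node; differing output
# between nodes suggests the inconsistent state described above.
for host in rabbit1 rabbit2 rabbit3; do   # placeholder host names
  echo "== $host =="
  ssh "$host" rabbitmqctl list_queues -p celery name messages
done
```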

Solution: As stated, the fix was to stop all rabbit servers at the same time for a few seconds so there is no "cache". Of course this only works if you aren't dependent on durable, permanent queues.
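A sketch of the simultaneous stop/start, assuming the stock init script and ssh access to each node (host names are placeholders):

```shell
# Stop rabbitmq on ALL nodes first, wait a few seconds, then start
# them all again. Restarting one node at a time is exactly what
# did NOT clear the error.
NODES="rabbit1 rabbit2 rabbit3"   # placeholder host names
for host in $NODES; do ssh "$host" service rabbitmq-server stop; done
sleep 10
for host in $NODES; do ssh "$host" service rabbitmq-server start; done
```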