
I connect to a cluster of RabbitMQ nodes (A, B) using spring-amqp. Assume there are two message receivers (Receiver_1 and Receiver_2), both of which were using connections to node A.

When A goes down, do Receiver_1 and Receiver_2 automatically switch their connections to B? Suppose A then comes back up and B shuts down. The receivers can't consume from A again. Why?

I debugged the Spring project and found that the Spring consumer is not at fault. Spring does switch to server "A", but the following exception is raised:

org.springframework.amqp.rabbit.listener.QueuesNotAvailableException: Cannot prepare queue for listener. Either the queue doesn't exist or the broker will not allow us to use it.
    at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:429)
    at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1022)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$DeclarationException: Failed to declare queue(s):[ha.rabbit.channel2]
    at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:486)
    at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:401)
    ... 2 more

I checked the RabbitMQ log file:

=ERROR REPORT==== 14-Sep-2015::03:45:41 ===
Channel error on connection <0.289.0> (192.168.1.150:64140 -> 192.168.1.170:5672, vhost: '/', user: 'admin'), channel 2:
{amqp_error,not_found,
    "home node 'rabbit@vm2' of durable queue 'ha.rabbit.channel2' in vhost '/' is down or inaccessible",
    'queue.declare'}

Even if I restart the consumer app with the same configuration as before (only server A up, B down), the same error is raised again. How can I tackle this?

Don't know about Spring, but com.rabbitmq.client has a recovery option (rabbitmq.com/api-guide.html#recovery) that should solve this problem.
Comment by Suvitruf - Andrei Apanasik

1 Answer


There is an open feature request for this.

It's not so easy because consumers, in particular, are long-lived, and failing back would interrupt processing by forcing each consumer to close and reconnect.

The framework doesn't know when would be a "good" time to do that.

You can programmatically call resetConnection() on the CachingConnectionFactory to force a fail-back but, again, existing consumers will be affected. (resetConnection() was added in 1.5; in earlier versions, call destroy().)

That said, it's not clear why such a fail-back is necessary, since the second server is likely the new master for the HA queues and it's probably better to consume from there anyway.

Since, by default, only one connection is used for all clients (in Spring AMQP), the existing failed-over connection will be used until it fails.

You could configure the connection factory to hand out a different connection to each user and set the cache size to 1, but that really defeats the whole purpose of caching.

EDIT

Another solution might be to write another connection factory that wraps 2 instances of the connection factory (each with one of the addresses configured).

Then, in the createConnection() method, you could "test" the first connection and use that if available.

This would cause "new" users (e.g. a RabbitTemplate) to fail back, but it still doesn't solve the listener container (consumer) case; you would have to force a reset on that connection to cause the consumers to fail back.
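The wrapping idea from the EDIT can be sketched roughly as follows. This is only an illustration in plain Java, not Spring AMQP code: BrokerConnection, NodeConnectionFactory, PreferPrimaryConnectionFactory and FakeNode are hypothetical stand-ins. A real wrapper would presumably implement Spring's ConnectionFactory interface and delegate to two CachingConnectionFactory instances, each configured with one address. The sketch also assumes an unreachable node is signalled by a runtime exception.

```java
// Hypothetical stand-in for a broker connection (not a Spring AMQP type).
interface BrokerConnection {
    String node();
}

// Hypothetical stand-in for a connection factory bound to a single address.
// Assumption: it throws a RuntimeException when its node is unreachable.
interface NodeConnectionFactory {
    BrokerConnection createConnection();
}

// Wraps two single-node factories and prefers the primary whenever it is reachable.
final class PreferPrimaryConnectionFactory implements NodeConnectionFactory {
    private final NodeConnectionFactory primary;
    private final NodeConnectionFactory secondary;

    PreferPrimaryConnectionFactory(NodeConnectionFactory primary, NodeConnectionFactory secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    @Override
    public BrokerConnection createConnection() {
        try {
            // "Test" the primary first; new users fail back as soon as it is up again.
            return primary.createConnection();
        } catch (RuntimeException primaryUnavailable) {
            // Primary is down: fall through to the secondary node.
            return secondary.createConnection();
        }
    }
}

// Fake node used only for the demonstration; "up" simulates broker availability.
final class FakeNode implements NodeConnectionFactory {
    private final String name;
    private volatile boolean up = true;

    FakeNode(String name) {
        this.name = name;
    }

    void setUp(boolean up) {
        this.up = up;
    }

    @Override
    public BrokerConnection createConnection() {
        if (!up) {
            throw new IllegalStateException("node " + name + " is down");
        }
        return () -> name;
    }
}

class FailBackDemo {
    public static void main(String[] args) {
        FakeNode a = new FakeNode("A");
        FakeNode b = new FakeNode("B");
        NodeConnectionFactory factory = new PreferPrimaryConnectionFactory(a, b);

        System.out.println(factory.createConnection().node()); // prints "A"
        a.setUp(false);
        System.out.println(factory.createConnection().node()); // prints "B" (failover)
        a.setUp(true);
        System.out.println(factory.createConnection().node()); // prints "A" (fail-back)
    }
}
```

Note that only callers requesting a new connection see the fail-back; a consumer already holding a connection to B keeps it until that connection is reset or fails, which is the limitation described above.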