Current system architecture around RabbitMQ
We have a group of queues and exchanges to support each message flow:
- Main exchange: This is where messages from producers arrive for first-time processing. It can be a topic or fanout exchange (the current issue is about fanouts).
- Main queue(s): The queue(s) from which the consumer picks up messages for processing.
- Dead exchange and queue: A basic dead-letter setup for bad messages.
- Delay queue: This queue receives messages from the consumer when they need to be retried. Messages in this queue have a specific TTL, and the dead-letter exchange for this queue is the “Main exchange”. Nobody listens to this queue; messages just sit there until they expire and are then moved to the “Main exchange”.
- Error queue: Messages that still cannot be processed after the retry go here. Again, nobody currently listens to this queue.
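To make the delay-queue mechanics concrete, here is a minimal sketch of the queue arguments involved. It only builds plain dictionaries and does not connect to a broker. The argument keys `x-message-ttl` and `x-dead-letter-exchange` are standard RabbitMQ queue arguments; the function name, the exchange name "main-exchange" and the 5-minute TTL are illustrative assumptions, not our real configuration.

```python
# Assumed retry delay of 5 minutes, expressed in milliseconds.
RETRY_TTL_MS = 5 * 60 * 1000


def delay_queue_arguments(main_exchange, ttl_ms=RETRY_TTL_MS):
    """Build arguments for a delay queue that has no consumers.

    Messages sit in the queue until the TTL expires; RabbitMQ then
    dead-letters them to `main_exchange`, which re-routes them.
    """
    return {
        "x-message-ttl": ttl_ms,
        "x-dead-letter-exchange": main_exchange,
    }


# "main-exchange" is a hypothetical name used only for illustration.
print(delay_queue_arguments("main-exchange"))
```

A dictionary like this would typically be passed as the `arguments` parameter when declaring the queue with a client library.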
Problem scenario with the above setup:
Say I have a fanout “foo-exchange” which sends messages to queues “bar” and “baz”. Let’s say a message comes in, and it’s valid, and “bar” processes it successfully, but “baz” fails for some reason (maybe an external service is down) and we want to retry it after 5 minutes. The message from “baz” is sent back to “foo-exchange” (via the delay queue), which then sends it not only to “baz” but to “bar” as well, so “bar” processes the same message a second time.
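The duplicate delivery can be reproduced with a toy in-memory model. This is only a sketch of fanout semantics (every bound queue receives a copy of every message), not RabbitMQ itself; the class `FanoutExchange` and the message "order-1" are hypothetical.

```python
class FanoutExchange:
    """Toy fanout exchange: each bound queue gets a copy of each message."""

    def __init__(self):
        self.queues = {}

    def bind(self, queue_name):
        # Create an empty queue for this binding if it does not exist yet.
        self.queues.setdefault(queue_name, [])

    def publish(self, message):
        for messages in self.queues.values():
            messages.append(message)


foo = FanoutExchange()
foo.bind("bar")
foo.bind("baz")

# First delivery: "bar" and "baz" each receive one copy.
foo.publish("order-1")

# "baz" fails and parks the message in the delay queue. When the TTL
# expires, the message is dead-lettered back to the fanout exchange,
# which behaves exactly like a fresh publish:
foo.publish("order-1")

deliveries = {name: len(msgs) for name, msgs in foo.queues.items()}
print(deliveries)  # {'bar': 2, 'baz': 2}
```

“bar” ends up with two copies even though it handled the first one successfully.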
Currently implemented solution (we want something better than this!)
We have an exchange per queue to send the retried message back to a specific queue after the retry timeout period.
In this scenario, we have 3 exchanges (“foo-exchange”, “foo-exchange-dead”, and “Baz-exchange-retry”, the last being 1 per queue bound to the fanout exchange), 3 queues (“baz-queue”, “baz-queue-delay”, “baz-queue-error”), and 1 dead queue for the whole exchange (“foo-queue-dead”).
This setup covers only 1 queue bound to the fanout exchange, and it grows considerably when a fanout exchange has multiple consumer queues.
So we need a solution that reduces this complex setup to a manageable number of queues and exchanges.
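To show how quickly this grows, the sketch below enumerates the objects needed for one fanout exchange, assuming every consumer queue gets the same trio of queues plus its own retry exchange. With N consumer queues that is N + 2 exchanges and 3N + 1 queues. The function `retry_topology` is hypothetical, and the lowercase naming pattern is inferred from the names listed above.

```python
def retry_topology(base, consumers):
    """Return (exchanges, queues) needed for one fanout exchange.

    `base` is the fanout's base name (e.g. "foo"); `consumers` are the
    base names of the queues bound to it (e.g. ["bar", "baz"]).
    """
    exchanges = [f"{base}-exchange", f"{base}-exchange-dead"]
    queues = [f"{base}-queue-dead"]
    for name in consumers:
        # One retry exchange and three queues per consumer queue.
        exchanges.append(f"{name}-exchange-retry")
        queues += [f"{name}-queue", f"{name}-queue-delay", f"{name}-queue-error"]
    return exchanges, queues


exchanges, queues = retry_topology("foo", ["bar", "baz"])
print(len(exchanges), len(queues))  # 4 7
```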
Things we already looked into:
- x-delay-exchange: This is not a good solution for us because it doesn’t tell us how many messages are waiting to be processed again. We need to know how many messages are pending retry in case of an external failure. (https://github.com/rabbitmq/rabbitmq-delayed-message-exchange)
- Message TTL on the main queue: This blocks all messages behind the delayed message.