
I know that achieving round-robin behaviour with a topic exchange can be tricky or even impossible, so my real question is whether there is any way to do this with RabbitMQ, or whether I should look at other message queues that support it.

Here's a detailed explanation of my application requirements:

  1. There will be one producer, let's call it P
  2. There (potentially) will be thousands of consumers, let's call them Cn
  3. Each consumer can "subscribe" to one or more topics, and multiple consumers can be subscribed to the same topic
  4. Every message published into the topic should be consumed by only ONE consumer

Use case #1

Assume:

Topics

  • foo.bar
  • foo.baz

Consumers

  • Consumer C1 is subscribed to topic #
  • Consumer C2 is subscribed to topic foo.*
  • Consumer C3 is subscribed to topic *.bar

Producer P publishes the following messages:

  1. publish foo.qux: C1 and C2 can potentially consume this message but only one receives it
  2. publish foo.bar: C1, C2 and C3 can potentially consume this message but only one receives it

Note: Unfortunately, I can't have a separate queue for each "topic", so using a direct exchange doesn't work: the number of topic combinations can be huge (tens of thousands).

From what I've read, there is no out-of-the-box solution with RabbitMQ. Does anybody know a workaround, or is there another message queue solution that would support this, e.g. Kafka, Kinesis, etc.?

Thank you

While I understand the question and find it theoretically very interesting (which is arguably 100% sufficient to ask it on SO), I'm also curious about the use case for this. In my mind, topic exchanges are designed for publishing messages on a "topic" and having the "people" interested in that type of message receive it. I don't see how it could be useful to "produce/consume" from a topic exchange, but I may be wrong. Personally, I don't see any other way than deferring the election of the winning consumer to some datastore. I might be wrong, and I don't know much about Kafka and Kinesis. – user1527491
My point is: if all (or some) consumers have to be informed that a message was sent, then use a topic exchange. If one and only one consumer has to consume the message in a producer/consumer fashion, then use a direct or a fanout exchange. If you need both, use both! Publish to the two exchanges. But this doesn't solve the problem, indeed, since you would have the wrong consumers consuming. – user1527491
BTW, although I'm not a Kafka expert, I'm 99% sure it wouldn't help here (and would even be worse), since it has no notion of routing at all. It's pub/sub, for better or for worse, meaning it wouldn't even help for producer/consumer stuff. And note that even RabbitMQ does not guarantee exactly-once delivery. – user1527491
@user1527491 Unfortunately I can't say much about the actual real-life use case (because of NDA and stuff), but I did my best to explain it as closely as possible. I do agree that this is quite challenging (and interesting) and a fun thing to solve. Another example I could think of is: imagine multiple lotteries, each with one or more entrants; each entrant can be in one or more lotteries, but there is only one winning entrant per lottery. – Andrei Stalbe
@user1527491 Yes, after a few days of checking around, it seems there's no way to implement this with pure RabbitMQ, or with any other message queue for that matter, which is quite sad and disappointing. I'm moving to an event-driven DB now (RethinkDB) to see what that has to offer. – Andrei Stalbe

1 Answer


There appears to be a conflation of the role of the exchange, which is to route messages, and the queue, which is to provide a holding place for messages waiting to be processed. Funneling messages into one or more queues is the job of the exchange, while funneling messages from the queue into multiple consumers is the job of the queue. Round robin only comes into play for the latter.
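To make the queue side of that split concrete, here is a minimal pure-Python sketch of round-robin dispatch from a single queue to several consumers. It is a toy simulation, not RabbitMQ code: `SimulatedQueue` and the consumer names are made up for illustration, and it assumes every consumer is always ready for the next message.

```python
from collections import deque
from itertools import cycle


class SimulatedQueue:
    """Toy stand-in for a broker queue (hypothetical, not a RabbitMQ API)."""

    def __init__(self, consumer_names):
        self.messages = deque()
        # Consumers attached to this queue take turns, one message each.
        self._next_consumer = cycle(consumer_names)

    def enqueue(self, message):
        self.messages.append(message)

    def dispatch_all(self):
        """Hand each waiting message to exactly one consumer, in turn."""
        deliveries = []
        while self.messages:
            message = self.messages.popleft()
            deliveries.append((next(self._next_consumer), message))
        return deliveries


queue = SimulatedQueue(["C1", "C2", "C3"])
for n in range(5):
    queue.enqueue(f"msg-{n}")

deliveries = queue.dispatch_all()
for consumer, message in deliveries:
    print(consumer, message)
```

Each message goes to exactly one consumer, and that only works because all three consumers share the same queue. The exchange plays no part in it.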

Fundamentally, a topic exchange operates by duplicating messages: one copy goes to each queue whose binding pattern matches the routing key published with the message. Therefore, any expectation of round-robin behavior would be a mistake, as it goes against the very definition of the topic exchange.
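As a rough illustration, the sketch below simulates topic routing in plain Python, using the bindings from the question. It assumes each consumer has its own queue (the names `queue_C1` etc. are invented here), and `topic_matches` is a hand-written approximation of the topic wildcard rules (`*` matches exactly one word, `#` matches zero or more words), not a call into any RabbitMQ library.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Approximate topic matching: '*' = exactly one word, '#' = zero or more."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb any number of remaining words, including none.
            return any(match(i + 1, jj) for jj in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(bindings: dict, routing_key: str) -> list:
    """Return every queue that would receive its own copy of the message."""
    return [q for q, pattern in bindings.items() if topic_matches(pattern, routing_key)]


# Hypothetical per-consumer queues with the bindings from the question.
bindings = {"queue_C1": "#", "queue_C2": "foo.*", "queue_C3": "*.bar"}

print(route(bindings, "foo.qux"))
print(route(bindings, "foo.bar"))
```

Under these assumptions, `foo.qux` is copied to the queues of C1 and C2, and `foo.bar` is copied to all three. Nothing in the exchange picks a single winner.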

All this does is establish that, by definition, the scenario presented in the question does not make sense. That does not mean the desired behavior is impossible, but the terms and topology may need some clarifying adjustments.

Let's take a step back and look at the described lifetime for one message: It is produced by exactly one producer and consumed by one of many consumers. Ordinarily, that is the scenario addressed by a direct exchange. The complicating factor in this is that your consumers are selective about what types of messages they will consume (or, to put it another way, your producer is not consistent about what types of messages it produces).

Ordinarily in message-oriented processing, a single message type corresponds to a single consumer type. Therefore, each different type of message would get its own corresponding queue. However, based on the description given in this question, a single message type might correspond to multiple different consumer types. One issue I have is the following statement:

Unfortunately I can't have a separate queue for each "topic"

On its face, that statement makes no sense, because what it really says is that you have arbitrarily many (in fact, an unknown number of) message types; if that were the case, then how would you be able to write code to process them?

So, ignoring that statement for a bit, we are led to two possibilities with RabbitMQ out of the box:

  1. Use a direct exchange and publish your messages using the type of message as a routing key. Then, have your various consumers subscribe to only the message types that they can process. This is the most common message processing pattern.

  2. Use a topic exchange, as you have, and add some sort of external de-duplication logic (perhaps memcached): each consumer checks an incoming message against it and discards the message if another consumer has already started to process it.
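The second option can be sketched as a "first claim wins" check. The example below is only an in-memory stand-in: `ClaimStore` and `handle_copy` are hypothetical names, and the code assumes the producer attaches a unique ID to each message. In a real deployment the claim would have to live in a shared store with an atomic "set only if absent" operation (memcached's `add` command behaves that way), since consumers normally run in separate processes.

```python
import threading


class ClaimStore:
    """In-memory stand-in for a shared store with an atomic add-if-absent."""

    def __init__(self):
        self._claims = {}
        self._lock = threading.Lock()

    def claim(self, message_id, consumer):
        """Return True only for the first consumer to claim this message."""
        with self._lock:
            if message_id in self._claims:
                return False
            self._claims[message_id] = consumer
            return True


def handle_copy(store, consumer, message_id, processed):
    """Process this copy only if the claim succeeds; otherwise discard it."""
    if store.claim(message_id, consumer):
        processed.append((consumer, message_id))
        return True
    return False


store = ClaimStore()
processed = []

# The topic exchange delivered one copy of "msg-42" to each of three consumers.
for consumer in ["C1", "C2", "C3"]:
    handle_copy(store, consumer, "msg-42", processed)

print(processed)
```

Note that this gives "exactly one consumer processes the message", not round robin: whichever consumer reaches the store first wins, so a fast consumer can end up taking most of the work.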

Now, neither of these deals explicitly with the round-robin requirement. Since the question does not explain why or how this is important, I have assumed that it can be ignored. If not, further definition of the problem space is required.