
Suppose we have an AWS SQS FIFO queue and two message producers, A and B. Each message is sent with a group id equal to the producer's name: producer A adds group id "A" to each message, and producer B adds group id "B". We also have three consumers, X, Y and Z, consuming messages with a visibility timeout. Let's assume that there are 5 messages in the queue: three from producer A and two from producer B. See the following image.

Given these conditions, we get the following workflow:

One of the consumers, for example X, receives message 1 with group id B from the queue, which makes this message and all other messages with group B invisible until message 1 is processed and deleted from the queue.

Then the other consumer, for example Y, receives message 2 with group id A which makes message 2 and all other messages with group A invisible until message 2 is processed and deleted from the queue.

Now consumer Z cannot consume any messages, because group A is blocked by the in-flight message 2 and group B is blocked by the in-flight message 1.
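To make the blocking concrete, here is a minimal in-memory sketch of the workflow above. It is a toy model, not the real SQS service: the class `FifoQueueModel` and its methods are hypothetical names, each receive returns a single message, visibility timeouts never expire, and the order of messages 3 to 5 is an assumption (the description only fixes message 1 as group B and message 2 as group A).

```python
class FifoQueueModel:
    """Toy model of FIFO message-group blocking (not the real SQS API)."""

    def __init__(self, messages):
        # messages: list of (message_id, group_id) tuples in queue order
        self.messages = list(messages)
        self.in_flight = {}  # group_id -> message_id currently being processed

    def receive(self):
        """Return the first message whose group has nothing in flight, else None."""
        for i, (message_id, group_id) in enumerate(self.messages):
            if group_id not in self.in_flight:
                self.in_flight[group_id] = message_id
                del self.messages[i]
                return message_id, group_id
        return None  # every remaining message belongs to a blocked group

    def delete(self, group_id):
        """Finish processing the in-flight message of a group, unblocking it."""
        self.in_flight.pop(group_id, None)


# Assumed queue order: messages 1 and 2 as described, the rest is illustrative.
queue = FifoQueueModel([(1, "B"), (2, "A"), (3, "A"), (4, "B"), (5, "A")])
got_x = queue.receive()  # consumer X gets message 1 (group B)
got_y = queue.receive()  # consumer Y gets message 2 (group A)
got_z = queue.receive()  # consumer Z gets nothing: both groups are blocked
```

In this model consumer Z only gets work after X or Y calls `delete` for its group, which mirrors the situation described above.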

Is there a technique allowing consumer Z to consume the next message from the queue in the given situation?

UPDATE 1: Why am I using a FIFO queue and group ids?

Let's assume that producers A and B represent two users and a Simple Queue is used instead of FIFO. There are also no group ids attached to messages.

Consider a scenario where producer A sends a hundred messages to the queue and, immediately afterwards, producer B sends a single message. That one message from producer B has to wait until all of A's messages are processed, which isn't good. We need to load balance between the messages of A and B even though A has a hundred messages and B has only one.

To achieve that, let's add group ids and, since only FIFO queues support them, replace the Simple Queue with a FIFO one. Now the above problem is solved. While any of producer A's messages is in flight, one of the consumers will receive producer B's message, even if that message is at the back of the queue. We now load balance between A and B.

The problem arises, however, when every group has a message in flight (the queue looks empty in this case) while additional consumers are available but have nothing to receive, which is also undesirable.

UPDATE 2: Suggested possible solutions.

Multiple group ids per producer

Say we have 10 consumers and only one producer, A. Let's add a number from 1 to 10 to each message group id, plus a unique id representing a batch of 10 messages, so we'll have group ids "A1-batch1", "A2-batch1", "A3-batch1" and so on up to "A10-batch1". If producer A has more messages, we increase the batch number and generate group ids for another ten, and then for another ten. Now each consumer is guaranteed to receive a message, which is great. But if producer B now sends one message, the balance ratio between producers A and B will be 10 to 1 in the worst case, which isn't great. Also, consumers scale horizontally, so producers have to know the current approximate number of consumers.
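The naming scheme above can be sketched as a small helper. This is only an illustration: `group_ids_for` and its `slots` parameter are hypothetical names, and `slots` stands for the approximate number of consumers the producer would have to know about.

```python
def group_ids_for(producer, message_count, slots=10):
    """Build one group id per message, e.g. 'A1-batch1' ... 'A10-batch1', 'A1-batch2'."""
    ids = []
    for index in range(message_count):
        slot = index % slots + 1    # cycles through 1..slots
        batch = index // slots + 1  # increases after every `slots` messages
        ids.append(f"{producer}{slot}-batch{batch}")
    return ids


ids = group_ids_for("A", 12)
```

Note that under this scheme every message ends up with its own group id, so nothing in the queue preserves ordering between producer A's messages any more.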

Separate queue per producer

Producers are users currently working with the service. We'll have to create a Simple Queue when a user connects to the service and notify consumers about the added queue. Consumers will have to poll each currently existing queue in turn and should be able to receive new messages even if there are messages in flight. In this case load balancing is fine. This solution adds complexity to the architecture, but should work, unless I've missed some technical limitation.
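The polling order of the separate-queue idea can be sketched with plain in-memory queues. This is a simplified model under stated assumptions: `poll_round_robin` is a hypothetical helper, each per-producer queue is represented by a list, and real consumers would call the SQS API on each queue instead.

```python
from collections import deque


def poll_round_robin(queues):
    """Yield (producer, message) pairs, taking one message from each queue in turn."""
    # Skip queues that are empty from the start.
    pending = {name: deque(msgs) for name, msgs in queues.items() if msgs}
    while pending:
        for name in list(pending):  # snapshot, since drained queues are removed
            yield name, pending[name].popleft()
            if not pending[name]:
                del pending[name]


order = list(poll_round_robin({"A": ["a1", "a2", "a3"], "B": ["b1"]}))
```

Here producer B's single message is served second rather than after all of A's messages, and once B's queue is drained the consumers keep working through A's backlog.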

If you want to grab another message that has the same Group ID as a message currently in-flight, it could result in messages with the same Group ID being processed out-of-order. So why are you using a FIFO queue? - John Rotenstein
Thanks for the reply. I updated the question with an explanation of why I am using a FIFO queue. Hope it explains my logic. - Mikhail
Will you only ever have 2 producers and 3+ consumers? Or might you have many more producers in future? If a producer is a "customer", then your current method is perfectly acceptable but suffering from only having 2 producers at the moment. If there are more than 2 producers, you would not have this 'blocking' behaviour. - John Rotenstein
2 producers and 3 consumers are for simplicity of explanation. There can be any number of producers and consumers, and sometimes there will be fewer producers than consumers. FIFO blocking behaviour does not allow processing multiple messages from one producer at a time, and Simple Queues lack group ids. I was under the impression that my case should be quite common and thus already solved by someone without compromise. - Mikhail
Your use-case is a bit unusual, since you want the functionality of Message Group IDs, but don't require FIFO. It's a little like trying to load-balance messages. It is usually solved by having separate queues, such as a high-priority queue and a normal-priority queue, with consumers processing the high-priority queue first. However, imagine a scenario where Producer A puts 100 messages on the queue, and then Producer B puts 100 messages on the queue: which message should be pulled next? You are seeking a "balanced pull" rather than FIFO. That's an unusual case. - John Rotenstein

1 Answer


The queue will operate as you describe, and this is intentional.

There are only two different Group IDs. If both Group IDs are in-flight, then no other messages can be retrieved.

If this is causing a problem for you, then you are most probably using Group IDs incorrectly.

A Group ID basically says "Please process this group of messages in-order". So, if one message is still being processed, the SQS FIFO queue prevents another message from the same Group ID being retrieved and processed. The fact that you want to grab another message with the same Group ID suggests to me that you do not want that group of messages processed in-order, so you should be using a different Group ID.

By using a Group ID linked to a producer, and having only two producers, you will never have more than two consumers processing messages from the queue at the same time.