
1. Consuming concurrently on the same topic and same partition

Suppose I have 100 partitions for a given topic (e.g. Purchases). I can easily consume these 100 partitions (e.g. Electronics, Clothing, etc.) in parallel using a consumer group with 100 consumers in it.

However, that assigns each consumer to one subset of the total data on Purchases. What if I want to consume just one subset of the data with 100 consumers concurrently? For example, suppose all of my consumers only care about the Electronics partition of the Purchases topic.

Is there a way they can consume this partition concurrently?

In general I just want all my consumers to receive the same data set concurrently.

From the information I've gathered, it seems that consumers CANNOT consume from replicas (see "Consuming from a replica").

Can I produce the same data to multiple topics, like Purchase-1[Electronics] and Purchase-2[Electronics] so then I can consume them concurrently? Is this a recommended approach?

2. Producing concurrently on the same topic and same partition

When multiple producers are producing to the same topic and same partition, we can only write to the partition leader, and the replicas are only there for fault tolerance. Does this mean there isn't any concurrency (i.e. each commit must wait in line)?

1 Answer

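  1. If those 100 consumers belong to different consumer groups, they can all consume from the same topic and partition simultaneously, because each group tracks its own offsets and receives every message. In that case, make sure each consumer can handle the load by itself: as the only member of its group, a consumer subscribed to the whole topic is assigned all 100 partitions.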
  2. Producers can produce to the same topic partition at the same time, but the actual order of messages written to the partition is determined by the partition leader.
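
To make both points concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Kafka client API: `PartitionLog`, `append`, `poll`, and the `group-N` names are all hypothetical and exist only for illustration. The lock stands in for the partition leader deciding the write order, and the per-group offset table stands in for each consumer group keeping its own position.

```python
import threading
from collections import defaultdict


class PartitionLog:
    """Toy stand-in for one partition on its leader (hypothetical, not a Kafka API)."""

    def __init__(self):
        self._records = []
        self._lock = threading.Lock()       # the "leader" serializes appends
        self._positions = defaultdict(int)  # group id -> next offset to read

    def append(self, value):
        # Many producers may call this at once; the lock decides the final
        # order, and each record gets the next offset in the log.
        with self._lock:
            self._records.append(value)
            return len(self._records) - 1

    def poll(self, group_id, max_records=10):
        # Every group has its own position, so reading in one group never
        # removes or hides records from another group.
        with self._lock:
            start = self._positions[group_id]
            batch = self._records[start:start + max_records]
            self._positions[group_id] = start + len(batch)
            return batch


def produce(log, producer_id, count):
    # Each producer sends its own numbered sequence of records.
    for i in range(count):
        log.append((producer_id, i))


electronics = PartitionLog()

# Four producers write to the same partition concurrently.
producers = [
    threading.Thread(target=produce, args=(electronics, p, 50))
    for p in range(4)
]
for t in producers:
    t.start()
for t in producers:
    t.join()

# 100 consumers, each in its own group, read the same partition.
views = {
    f"group-{g}": electronics.poll(f"group-{g}", max_records=1000)
    for g in range(100)
}
```

In this model every group ends up with the same 200 records in the same order. How the four producers interleave varies from run to run, which mirrors point 2: producers can write at the same time, but the log itself fixes the order.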