The Apache Kafka documentation states the following:
If all the consumer instances have the same consumer group, then the records will effectively be load balanced over the consumer instances.
If all the consumer instances have different consumer groups, then each record will be broadcast to all the consumer processes.
This makes things a bit unclear for me when thinking about partitions. Does the second statement mean that if I have multiple consumer groups, each consumer in each group will read all the records in all partitions?
Also, as far as I understand it, the diagram used in the documentation does not agree with the above.
In fact, I was reading a great article, "Kafka in a Nutshell", and the statements quoted below match the diagram in the documentation much better:
Consumers can also be organized into consumer groups for a given topic — each consumer within the group reads from a unique partition and the group as a whole consumes all messages from the entire topic. If you have more consumers than partitions then some consumers will be idle because they have no partitions to read from. If you have more partitions than consumers then consumers will receive messages from multiple partitions. If you have equal numbers of consumers and partitions, each consumer reads messages in order from exactly one partition.
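To make my reading of that paragraph concrete, here is a toy Python sketch. It is not Kafka's client API: `assign_partitions`, the group names, and the consumer names are all made up for illustration, and the round-robin split is only an assumption (Kafka's real assignors may divide partitions differently). It just models the idea that each group is assigned the full set of partitions independently, while inside a group every partition belongs to exactly one consumer.

```python
def assign_partitions(partitions, consumers):
    """Spread the partitions over the consumers of ONE group, round-robin.

    Hypothetical helper for illustration only; not part of any Kafka library.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# One topic with four partitions.
partitions = [0, 1, 2, 3]

# Two groups subscribed to the same topic (names are made up).
groups = {
    "group-A": ["A1", "A2"],                    # fewer consumers than partitions
    "group-B": ["B1", "B2", "B3", "B4", "B5"],  # more consumers than partitions
}

# Each group is assigned independently, so each group covers every partition.
assignments = {
    group: assign_partitions(partitions, members)
    for group, members in groups.items()
}

for group, assignment in assignments.items():
    print(group, assignment)
```

If this model is right, group-A's two consumers each read from two partitions, group-B has one idle consumer, and both groups as a whole still see every record of the topic. A single consumer only reads all partitions when it is the only member of its group.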
I was hoping someone could shed some light on the above and clearly explain a scenario based on Apache's official documentation.