I have a partitioned topic, which has X partitions.
As of now, when producing messages, I create Kafka's ProducerRecord specifying only topic and value. I do not define a key.
As far as I understand, my messages will be distributed roughly evenly among the partitions by the default built-in partitioner.
On the other hand, I have a thread pool of Kafka consumers. Each Kafka consumer runs in its own dedicated thread, consuming messages from the topic. Each of those consumers is given the same group.id. This allows messages to be consumed in parallel: every consumer is assigned its fair share of partitions to read from.
I want my messages to be consumed in order. I know that Kafka guarantees the order of messages only within a partition. So, as long as I come up with a proper key structure, all messages that must stay in order relative to each other will share a key and therefore end up in the same partition. In effect, the message key groups messages: every record with a given key is stored in the same partition.
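To make sure I have the idea right, here is a minimal sketch of how I picture key-based assignment. It does not use the Kafka client at all: `KeyPartitionSketch`, `partitionFor`, and `assign` are names I made up, and `String.hashCode()` is only a stand-in for the hash the real default partitioner applies to the serialized key (murmur2, as far as I know).

```java
import java.util.ArrayList;
import java.util.List;

class KeyPartitionSketch {

    // Stand-in for the partitioner: hash the key, then take it modulo the
    // partition count. Assumption: the real partitioner uses a different hash.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Appends each record ("key:value") to the partition chosen by its key,
    // preserving the order in which records were produced.
    static List<List<String>> assign(String[][] records, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String[] record : records) {
            int p = partitionFor(record[0], numPartitions);
            partitions.get(p).add(record[0] + ":" + record[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical records: {key, value}
        String[][] records = {
            {"user-1", "created"}, {"user-2", "created"},
            {"user-1", "updated"}, {"user-2", "deleted"},
            {"user-1", "deleted"}
        };
        List<List<String>> partitions = assign(records, 3);
        for (int p = 0; p < partitions.size(); p++) {
            System.out.println("partition " + p + " -> " + partitions.get(p));
        }
    }
}
```

In this model, all records for a given key land in one partition in the order they were produced, while records with different keys may or may not share a partition.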
Does it make sense?
Q: Is there a chance that a badly designed key will give me uneven partitions, where one partition receives far more records than the others? Can this hurt the performance of my Kafka cluster? What are the best practices for message key design?
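To make the concern concrete, here is a rough simulation of the skew I am worried about. Again, this is plain Java with no Kafka dependency: `KeySkewSketch` and the `tenant-...` keys are hypothetical, and `String.hashCode()` merely stands in for the partitioner's real hash function.

```java
import java.util.Arrays;

class KeySkewSketch {

    // Stand-in for the partitioner: hash of the key modulo the partition count.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Counts how many records each partition would receive for the given keys.
    static int[] countPerPartition(String[] keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 900 records share one "hot" key,
        // the remaining 100 each have a distinct key.
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = i < 900 ? "tenant-big" : "tenant-" + i;
        }
        System.out.println(Arrays.toString(countPerPartition(keys, 4)));
    }
}
```

Whatever the hash function, every record with the hot key maps to the same partition, so in this workload one partition holds at least 900 of the 1000 records, and the single consumer assigned to it would do most of the work.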