3
votes

I have two topics, one with 3 partitions and one with 48.

Initially I used the default assignor, but I ran into problems when a consumer (a pod in Kubernetes) crashed.

What happened was that when the pod came back up, it was reassigned a partition from the topic with 3 partitions and none from the topic with 48.

The two pods that did not crash were assigned 16 and 32 partitions from the topic with 48 partitions.

I've fixed this by using a round-robin partition assignor, but now I don't feel confident in how the partitions are distributed, since I'm using KStream-KStream joins. For those we need to guarantee that every consumer is assigned the same partition numbers across all input topics, e.g. C1:(t1:p0, t2:p0), C2:(t1:p1, t2:p1), etc.

One thing I thought of was to rekey the incoming events so they get repartitioned; would that let me guarantee this?

Or maybe I don't understand how the default partitioning works... I'm confused.
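For what it's worth, Kafka's default partitioner picks a partition by hashing the record key modulo the topic's partition count (the real implementation uses murmur2; the sketch below substitutes `hashCode()` purely for illustration, and the key names are made up). This is why rekeying into topics with the *same* partition count co-partitions the data, while a 3-vs-48 mismatch does not:

```java
import java.util.List;

public class CoPartitioningSketch {
    // Simplified stand-in for Kafka's default partitioner, which computes
    // murmur2(keyBytes) % numPartitions. hashCode() is used here only to
    // illustrate the principle; the real hash function differs.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("order-1", "order-2", "order-3");
        for (String key : keys) {
            // Same key + same partition count => same partition on both topics,
            // which is exactly the co-partitioning a KStream-KStream join needs.
            int p1 = partitionFor(key, 3);  // topic t1, 3 partitions
            int p2 = partitionFor(key, 3);  // topic t2, 3 partitions
            // With a mismatched partition count (48), the same key can land
            // on a different partition number, breaking the join pairing.
            int p48 = partitionFor(key, 48);
            System.out.printf("%s -> t1:p%d, t2:p%d, t48:p%d%n", key, p1, p2, p48);
        }
    }
}
```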

Actually, Kafka Streams does not allow using a custom partition assignor. Your custom partition assignor is probably being ignored. Additionally, according to the Kafka Streams docs, the input topics of a join (left side and right side) must have the same number of partitions. – Bruno Cadonna
OK, so I thought of merging the streams instead and then doing a stream.through(newTopic), just to merge the data into one topic with 3 partitions; then I'd filter it into two streams and do the joins. What do you think of this idea? The other solution I was considering was a microservice that just forwards the messages to one topic, which I'd then filter into two new KStreams and join. Any better ideas? Also, about the partition assignor: is there really no way to change it with KStreams? – kambo
It seems you can find an answer here: stackoverflow.com/questions/18202986/… – dmkvl
@dmkvl: The thread you linked covers how to customize the partitioner on the producer side and how the default partition assignor works on the consumer side. However, this question is about customizing the partition assignor. – Bruno Cadonna

1 Answer

8
votes

Kafka Streams does not allow you to use a custom partition assignor. If you set one yourself, it will be overwritten with the StreamsPartitionAssignor [1]. This is needed to ensure that, if possible, partitions are re-assigned to the same consumers (a.k.a. stickiness) during rebalancing. Stickiness is important for Kafka Streams to be able to reuse state stores on the consumer side as much as possible. If a partition is not reassigned to the same consumer, the state stores used within that consumer need to be recreated from scratch after rebalancing.

[1] https://github.com/apache/kafka/blob/9bd0d6aa93b901be97adb53f290b262c7cf1f175/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L989
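To illustrate the stickiness idea, here is a toy model. This is not the real StreamsPartitionAssignor (whose algorithm is considerably more involved); it only shows the principle that a rejoining member first gets back the partitions it previously owned, up to its fair share, so its local state stores stay usable:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy sticky assignor -- NOT the real StreamsPartitionAssignor.
public class StickyToy {

    static Map<String, Set<Integer>> assign(List<String> members,
                                            int numPartitions,
                                            Map<String, Set<Integer>> previous) {
        int share = (int) Math.ceil((double) numPartitions / members.size());
        Set<Integer> free = new TreeSet<>();
        for (int p = 0; p < numPartitions; p++) free.add(p);

        Map<String, Set<Integer>> out = new HashMap<>();
        // Pass 1: stickiness -- each member keeps its previously owned
        // partitions, capped at the fair share.
        for (String m : members) {
            Set<Integer> mine = new LinkedHashSet<>();
            for (int p : previous.getOrDefault(m, Set.of())) {
                if (mine.size() < share && free.remove(p)) mine.add(p);
            }
            out.put(m, mine);
        }
        // Pass 2: hand out any leftover partitions to the least-loaded members.
        for (int p : new ArrayList<>(free)) {
            String target = members.get(0);
            for (String m : members) {
                if (out.get(m).size() < out.get(target).size()) target = m;
            }
            out.get(target).add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        // c3 crashed and rejoins; before the crash it owned partitions 2 and 5.
        Map<String, Set<Integer>> previous = Map.of(
                "c1", Set.of(0, 3), "c2", Set.of(1, 4), "c3", Set.of(2, 5));
        Map<String, Set<Integer>> now =
                assign(List.of("c1", "c2", "c3"), 6, previous);
        // Sticky: c3 gets 2 and 5 back, so its local state stores remain valid.
        System.out.println(now);
    }
}
```

A pure round-robin assignor, by contrast, ignores previous ownership entirely, which is exactly why Kafka Streams refuses to let you swap it in.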