Is there a way to repartition the input topic in Kafka streams?

Question

I have a topic keyed by byte[]. I want to repartition it and process it by another key, taken from a field in the message body.

I found KGroupedStream and the groupBy function, but it requires an aggregation function to convert the result to a KTable/KStream. I don't need an aggregation; I just want to repartition and process the output.

Matthias J. Sax · Accepted Answer · 2018-03-31T04:07:34

Yes, you can. Set a new key and then pipe the data through another topic.

// Kafka Streams 2.6 and newer:
//
// repartition() will create the required topic automatically for you,
// with the same number of partitions as your input topic;
//
// it's also possible to set the number of partitions explicitly to scale in/out
// via `repartition(Repartitioned.numberOfPartitions(...))`
KStream stream = ...
KStream repartitionedStream = stream.selectKey(...)
                                    .repartition();

// older versions:
//
// using `through()` you need to create the topic manually,
// before you start your application
KStream stream = ...
KStream repartitionedStream = stream.selectKey(...)
                                    .through("topic-name");

Note that you need to create the topic you use in through(), with the desired number of partitions, before you start the application.
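
To see why setting a new key is not enough on its own, here is a toy simulation in plain Java (no Kafka dependency). It assigns records to partitions by hashing the key, first with the original key and then with a field from the message body. Everything in it is hypothetical and for illustration only: the Message class, the customerId field, and the partitionFor helper are made up, and the hash is a stand-in, since Kafka's default partitioner uses murmur2 rather than Arrays.hashCode.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

class RekeyDemo {
    // Hypothetical message: the original topic key plus a field from the body.
    static final class Message {
        final String originalKey;
        final String customerId;
        final String payload;

        Message(String originalKey, String customerId, String payload) {
            this.originalKey = originalKey;
            this.customerId = customerId;
            this.payload = payload;
        }

        @Override
        public String toString() {
            return payload;
        }
    }

    // Stand-in partitioner: hash of the key bytes modulo the partition count.
    // (Assumption: simplified hash, NOT Kafka's murmur2-based default.)
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, numPartitions);
    }

    // Groups messages by the partition that the selected key maps to.
    static Map<Integer, List<Message>> partitionBy(List<Message> messages,
                                                   Function<Message, String> keySelector,
                                                   int numPartitions) {
        Map<Integer, List<Message>> partitions = new TreeMap<>();
        for (Message m : messages) {
            int p = partitionFor(keySelector.apply(m), numPartitions);
            partitions.computeIfAbsent(p, ignored -> new ArrayList<>()).add(m);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Message> messages = List.of(
                new Message("k1", "alice", "order-1"),
                new Message("k2", "bob", "order-2"),
                new Message("k3", "alice", "order-3"));

        // Layout of the input topic, partitioned by the original key.
        System.out.println("by original key: " + partitionBy(messages, m -> m.originalKey, 4));
        // Layout after re-keying and writing through another topic:
        // both "alice" records are guaranteed to share a partition.
        System.out.println("by customerId:   " + partitionBy(messages, m -> m.customerId, 4));
    }
}
```

In the first layout, records for the same customerId are only in the same partition by coincidence; in the second they always are. That redistribution is the step repartition() (or through() on older versions) performs after selectKey() has changed the key.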
