0
votes

We have a use case where, based on a work item arriving on a worker queue, we need to use the message metadata to decide which Kafka topic to stream our data from. We would deploy fewer than 100 worker nodes, and each worker node has a configurable number of threads for receiving messages from the queue. So if a worker has "n" threads, we could end up opening Kafka streams to "n" different topics (n is usually less than 10). Once the worker is done processing a message, we also need to close the stream. The worker receives the next message once it has acked the first one, at which point I need to open a Kafka stream for another topic. Also, every Kafka stream needs to scan all the partitions (around 5-10) of the topic to filter by a certain attribute.

Can a flow like this work for Kafka streams or is this not an optimal approach?


1 Answer

0
votes

I am not sure if I fully understand the use case, but it seems to be a "simple" copy-data-from-topic-A-to-topic-B use case, i.e., no data processing/modification. The logic for deciding what to copy from the input to the output topic seems complex though, and thus Kafka Streams (i.e., Kafka's stream processing library) might not be the best fit, as you need more flexibility.

However, using plain KafkaConsumers and KafkaProducers should allow you to implement what you want.
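To make the per-message lifecycle from the question concrete, here is a minimal sketch of the flow: derive the topic from the queue message's metadata, open a consumer for that topic, scan all its partitions filtering by an attribute, and close the consumer before acking. The Kafka client itself is stubbed out (`StubConsumer`, `poll_all`, `pick_topic` are illustrative names, not a real API); in a real worker you would use a `KafkaConsumer`, `assign()` it all partitions of the topic, `poll()` until you reach the end offsets, and `close()` it.

```python
def pick_topic(message_metadata):
    # Hypothetical: derive the Kafka topic from the queue message's metadata.
    return message_metadata["topic"]

class StubConsumer:
    """Stand-in for a KafkaConsumer assigned to all partitions of one topic."""
    def __init__(self, topic, partitions):
        self.topic = topic
        self.partitions = partitions  # e.g. {0: [records], 1: [records], ...}
        self.closed = False

    def poll_all(self):
        # Real code would call consumer.poll() in a loop until the end
        # offsets are reached; here we just yield every stored record.
        for records in self.partitions.values():
            yield from records

    def close(self):
        self.closed = True

def handle_work_item(message_metadata, partitions_by_topic, attribute, wanted):
    """Open a consumer for the topic named in the metadata, scan all
    partitions, keep records whose attribute matches, then close."""
    topic = pick_topic(message_metadata)
    consumer = StubConsumer(topic, partitions_by_topic[topic])
    try:
        return [r for r in consumer.poll_all() if r.get(attribute) == wanted]
    finally:
        consumer.close()  # always release the stream before acking the message
```

Each worker thread would run `handle_work_item` once per queue message, so a worker with "n" threads holds at most "n" consumers open at a time, matching the bound described in the question.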