0
votes

I have recently started working with Apache Kafka. One thing I keep seeing on various blogs is multiple topics being configured on the same listener.

My question is: is it a good practice to do so? Let's say we are receiving 100 messages per second on each topic. Messages from each topic need different customization, and messages from individual topics go to their respective tables. Example: messages from topic1 go to the topic_1 table.

It's a Spring Boot app that I'm working on. I would also like to know what other challenges I might face going forward.

Update: Code sample

@KafkaListener(topics = "#{'${kafka-consumer.topics}'.split(',')}", groupId = "${kafka-consumer.groupId}")
public void consume(KafkaConsumer<String, String> record) {
    int count = 0;
    ConsumerRecords<String, String> records = record.poll(1000);
    for (ConsumerRecord<String, String> data : records) {
        System.out.println(data.value());
        count++;
    }
    //record.listTopics()
    if (count > 0) {
        record.commitAsync();
    }
}
If you are using one listener to consume from multiple topics, it will slow down your consumption and processing. - Deadpool
Can you share your code? Why can't you just make multiple listeners for each topic? Typically you should only grab multiple topics if all records must be processed similarly. - OneCricketeer

2 Answers

1
votes

My question is, Is it a good practice to do so?

It depends on the use case. In your example, where each topic corresponds to a table, you should probably have a consumer per topic, because a consumer that reads from many unrelated topics will slow down as it works through all of them. Consuming is usually the slower side compared with producing, so the most common setup is to split a topic into multiple partitions and run multiple consumers (in the same consumer group) per topic.
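To make the partition point concrete, here is a minimal sketch in plain Java (no Kafka client involved) of how a topic's partitions might be divided among the consumers of one group. It is a simplified model of a range-style split, not Kafka's actual assignor code, and `PartitionSplit` and `assign` are names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionSplit {

    // Returns, for each consumer index, the partition numbers it would own.
    // Partitions are handed out in contiguous ranges; the first
    // (partitions % consumers) consumers get one extra partition.
    static List<List<Integer>> assign(int partitions, int consumers) {
        if (partitions < 0 || consumers <= 0) {
            throw new IllegalArgumentException("need partitions >= 0 and consumers > 0");
        }
        int base = partitions / consumers;
        int extra = partitions % consumers;
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int c = 0; c < consumers; c++) {
            int size = base + (c < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(6, 3)); // [[0, 1], [2, 3], [4, 5]]
        System.out.println(assign(2, 3)); // [[0], [1], []]
    }
}
```

With 2 partitions and 3 consumers the third consumer owns nothing: within a group, a partition is read by at most one consumer, so extra consumers only help up to the partition count.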

It would make sense to consume from multiple topics if the topics are related. Confluent has written a white paper about one such use case: data is replicated between data centers, and the topics are prefixed with a data center ID. The consumers then consume from all topics that share the same name but carry different data center IDs.
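As a rough illustration of the prefix idea, the sketch below uses only `java.util.regex` to show which topic names one pattern would select. The naming scheme (`dc1.orders`, `dc2.orders`) and the class `DataCenterTopics` are assumptions for the example, not taken from the white paper. In a real application the same kind of regex would be handed to the consumer, for instance via `KafkaConsumer.subscribe(Pattern)` or the `topicPattern` attribute of Spring's `@KafkaListener`.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class DataCenterTopics {

    // Hypothetical naming scheme: "<data-center-id>.<logical-topic>", e.g. "dc1.orders".
    static final Pattern ORDERS_ANY_DC = Pattern.compile("dc\\d+\\.orders");

    // Keeps only the topic names that match the pattern in full.
    static List<String> matching(List<String> topics) {
        return topics.stream()
                .filter(t -> ORDERS_ANY_DC.matcher(t).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> all = List.of("dc1.orders", "dc2.orders", "dc1.payments", "orders");
        System.out.println(matching(all)); // [dc1.orders, dc2.orders]
    }
}
```

All matched topics carry the same kind of record, so a single listener can process them identically, which is the situation where sharing a listener is reasonable.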

0
votes

It's generally not a good practice, because it can dramatically decrease the consumption rate: a single listener has to work through the records of every topic it is subscribed to.

In some cases, though, you'll have to use it: for example, if you have many producers that can be spawned dynamically, and you'd like to keep consuming data from every one of them while still being able to send data to a specific device.

For example:

Suppose there are a lot of sensors, each of which sends to its own topic named after its ID, such as outgoing/12445646.

The consumer of the data from all those sensors listens to the outgoing/* topics, but it can still send a message directly to one sensor on a channel like incoming/12445646.

A separate outgoing channel per device can be very handy for traffic control, where one can spawn dedicated consumers for high-throughput channels, or for dealing with a specific device without affecting the rest.
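A minimal sketch of that naming convention, in plain Java: given the topic a record arrived on, work out the reply topic for the same sensor. Kafka topic names only allow ASCII letters, digits, `.`, `_` and `-`, so the sketch assumes `outgoing.<id>` and `incoming.<id>` in place of the slash-separated names above. The class `SensorTopics` and the method `replyTopicFor` are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SensorTopics {

    // Assumed convention: each sensor publishes to "outgoing.<numeric id>".
    static final Pattern OUTGOING = Pattern.compile("outgoing\\.(\\d+)");

    // Maps "outgoing.<id>" to "incoming.<id>"; empty if the name does not fit the convention.
    static Optional<String> replyTopicFor(String sourceTopic) {
        Matcher m = OUTGOING.matcher(sourceTopic);
        return m.matches() ? Optional.of("incoming." + m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(replyTopicFor("outgoing.12445646")); // Optional[incoming.12445646]
        System.out.println(replyTopicFor("metrics.cpu"));       // Optional.empty
    }
}
```

A listener subscribed with the `OUTGOING` pattern would receive records from every sensor, and could use the source topic of each record to address a reply to just that device.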