
I want to know the best approach for the following case.

In a microservices project, one application (the producer) publishes messages that are consumed by many downstream applications. Some messages are meant for the first application, some for the second, some for the third, and so on. What is the best way to make sure each consumer consumes only the messages meant for it?

Should I have one topic with as many partitions as there are consumers, and use a key when publishing so that each partition is read by one particular consumer?

Or should I have one topic per consumer, with one or multiple partitions in each topic?

We should also consider that the number of consumers may increase in the future, so the solution should be able to handle that easily.

2
Are the different applications reading overlapping messages from the topic? Or is it rather the case that part of the messages are only meant to be read by one single consumer? - Michael Heil
Some messages need to be read by all the applications, but some are specific to a particular application (consumer). - Nikhil Gupta

2 Answers

0
votes

You should not try to use partitions for routing to consumers. Partitions are for scalability: while you can be sure that the same key will go to the same partition, you cannot know which consumer will consume from that partition at any given time.
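
To illustrate the point, here is a minimal pure-Python sketch, not real Kafka client code. `partition_for` and `assign` are hypothetical helpers: the hash is a stand-in (Kafka's default partitioner uses murmur2 for keyed records) and the round-robin assignment is a toy model of what happens inside one consumer group.

```python
import zlib


def partition_for(key, num_partitions):
    # Stand-in hash (assumption): the property shown is only that the
    # same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(partitions, consumers):
    # Toy round-robin assignment of partitions to the live consumers
    # of one group; real assignors differ in detail.
    live = sorted(consumers)
    return {p: live[i % len(live)] for i, p in enumerate(partitions)}


partitions = [0, 1, 2]
before = assign(partitions, ["c1", "c2", "c3"])
after = assign(partitions, ["c1", "c3"])  # c2 left, group rebalanced
```

The key-to-partition mapping is unchanged by the rebalance, but in this toy model partitions 1 and 2 end up with a different owner, so a key cannot be relied on to reach a particular consumer.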

Hence the number of partitions per topic is unrelated to your question and should be set to allow for future scaling needs.

Your choice is whether to use a single topic, a topic per consumer application, or something in between.

A single topic is fine as long as the consumers in each application are in a separate consumer group from the consumers in other applications, so that each application gets all messages. The downside is that each application would have to filter out the messages it is not interested in.
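
A rough sketch of that consumer-side filtering, in plain Python rather than a Kafka client. The `audience` field, the `"*"` broadcast marker and the application names are assumptions made up for illustration; in practice this could be a record header or a field in the payload.

```python
from dataclasses import dataclass

ALL = "*"  # hypothetical marker for "every application should read this"


@dataclass(frozen=True)
class Message:
    audience: frozenset  # hypothetical field naming the intended applications
    payload: str


def wanted_by(app, msg):
    # An application keeps broadcasts and messages addressed to it.
    return ALL in msg.audience or app in msg.audience


def consume(app, topic):
    # Each application (its own consumer group) sees the whole topic
    # and drops what it does not need.
    return [m.payload for m in topic if wanted_by(app, m)]


topic = [
    Message(frozenset({ALL}), "broadcast"),
    Message(frozenset({"billing"}), "invoice"),
    Message(frozenset({"shipping"}), "parcel"),
]
```

With this data, `consume("billing", topic)` keeps the broadcast and the invoice, while `consume("shipping", topic)` keeps the broadcast and the parcel.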

A topic per application may also be fine, though the producer then has the overhead of knowing where to route each message, which could lead to complex configuration.

Another approach is to have topics based on a logical split by message type. Several applications may subscribe to one topic, some applications may subscribe to several topics, and they may not be interested in all messages. Producers then don't need to know who is consuming, just which logical area a message relates to (it is for you to decide how to divide up topics and message types).
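
As a sketch of what that logical split could look like (plain Python, with made-up topic names, message types and applications, and a dict standing in for the broker):

```python
# Hypothetical mapping owned by the producer: message type -> logical topic.
TOPIC_BY_TYPE = {
    "order_created": "orders",
    "order_cancelled": "orders",
    "payment_received": "payments",
}

# Hypothetical subscriptions, owned by each consuming application.
SUBSCRIPTIONS = {
    "billing": {"orders", "payments"},
    "shipping": {"orders"},
}


def publish(log, message_type, payload):
    # The producer only looks at the message type, never at the consumers.
    topic = TOPIC_BY_TYPE[message_type]
    log.setdefault(topic, []).append((message_type, payload))


def read(app, log):
    # An application reads every topic it subscribed to.
    return [entry for t in sorted(SUBSCRIPTIONS[app]) for entry in log.get(t, [])]


log = {}
publish(log, "order_created", "o1")
publish(log, "payment_received", "p1")
```

Adding a new consuming application here only means adding an entry to `SUBSCRIPTIONS`; the producer-side mapping stays untouched.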

0
votes

In my eyes this sounds like a good use case for using Kafka.

I recommend not duplicating data and having all messages published to one topic with multiple partitions. Processing data out of Kafka scales with the number of partitions, so I would set the number based on your expected amount of data and required throughput. If you have requirements on the order of the messages in the partitioned topic, you can use a custom partitioner in your producer to steer the distribution of the data into that topic. Be aware that the ordering of messages in Kafka is only guaranteed within a partition.

The consumers that subscribe to this Kafka topic should be as independent as possible from the producer. So I would not try to have the producer "do something" for the consumers. Ideally, the producer should not need to know anything about the consumers, especially if you plan to add more consumers in the future.

The consumers should also be independent of each other, and each should use a different consumer group. That way each consumer controls how it reads the data from the topic. Also, in case of failure, a consumer can re-read the data of a Kafka topic from the beginning without affecting any other consumer.
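
To make the independence concrete, here is a toy single-partition model in plain Python (not a Kafka client; the `TopicLog` class and its method names are invented for this sketch). Each group id has its own read position, so rewinding one group does not move any other.

```python
class TopicLog:
    """Toy model of one partition with a separate offset per consumer group."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # group id -> next offset to read

    def append(self, record):
        self.records.append(record)

    def poll(self, group):
        # Return everything the group has not read yet and advance its offset.
        start = self.offsets.get(group, 0)
        batch = self.records[start:]
        self.offsets[group] = len(self.records)
        return batch

    def seek_to_beginning(self, group):
        # Rewind only this group; other groups keep their position.
        self.offsets[group] = 0


log = TopicLog()
log.append("a")
log.append("b")
first_billing = log.poll("billing")
first_shipping = log.poll("shipping")
log.seek_to_beginning("billing")  # e.g. billing re-reads after a failure
replayed_billing = log.poll("billing")
second_shipping = log.poll("shipping")
```

In this model both groups first receive the full log, and after the rewind only the billing group sees the records again; the shipping group's next poll is empty.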