
I am working on an application/Kafka cluster that will be producing and consuming messages (around 100k per second) on a topic. The message format is identical across messages, so my initial thought was to have a single topic for all of them.

However, are there any benefits for Kafka in splitting the messages across multiple topics? There is a logical separation that could be applied, which would split the single topic into multiple (roughly 10) topics.

Apart from the producer/consumer side of things, does Kafka itself have any preference around performance, redundancy, stability, management, etc. when using one large topic versus multiple smaller topics?


2 Answers

2 votes

Topic partitions are the usual means of parallelizing Kafka; however, you could opt to split the data into multiple topics as well if you wanted. I would first look into the partitioning side of things, though. Confluent has a good article on how to pick the right number of partitions. In particular, note that if you are partitioning on keys, adding partitions after the fact can result in split data (records for the same key spread across old and new partitions), so think it through properly up front as best you can.
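As a minimal sketch of why that matters (the topic name "events" and broker address are assumptions for illustration), the default partitioner derives the partition from a hash of the record key modulo the current partition count, so records with the same key stay together only while the partition count stays the same:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class KeyedProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Records with the same key land on the same partition because
            // the default partitioner hashes the key and takes it modulo the
            // current number of partitions. If partitions are added later,
            // that mapping changes, and old and new records for the same key
            // can end up on different partitions.
            producer.send(new ProducerRecord<>("events", "order-42", "created"));
            producer.send(new ProducerRecord<>("events", "order-42", "paid"));
        }
    }
}
```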

0 votes

Parallelism in Kafka depends on the number of partitions in a topic. Throughput will increase as long as the number of partitions stays at an optimal level (an unnecessarily large number of partitions creates overhead). By increasing the number of consumers in a consumer group, you can stream messages from the partitions simultaneously, as sketched below.
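A rough sketch of that consumer-side parallelism (the topic name "events", the group id "events-group", and the broker address are assumptions for illustration): every consumer instance started with the same group.id is assigned a subset of the topic's partitions, so running several instances reads the partitions in parallel, up to one consumer per partition.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class GroupConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "events-group");            // consumers sharing this id split the partitions
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));
            while (true) {
                // Each instance of this program polls only the partitions
                // assigned to it; starting more instances (up to the
                // partition count) spreads the load across them.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```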