1
votes

I have a few design questions and would appreciate feedback based on your experience with Kafka and KafkaJS (or any similar library):

  1. Is partitioning a way to scale in Kafka? If I create 3 partitions and have only 1 consumer, do I lose the messages in the 2 unused partitions? And if I spin up 2 new consumers, does KafkaJS manage assigning dedicated partitions to the new consumers? Is eachBatch the only way to implement parallel processing in consumers, or can it be done with eachMessage while controlling the rate at which messages are processed?
  2. What is the recommended way to scale consumers? Partitions, async parallel processing, more consumer nodes, etc.? Currently, I have 1 node consuming ~30 messages per min. My goal is to scale the consumer because the expected rate could be ~2000 or higher.
1
~30 messages per min is really slow for a single consumer... But yes, partitions are the scaling factor, and a 1:1 ratio of consumers to partitions would consume the fastest. – OneCricketeer

1 Answer

3
votes

I'll try to give a general answer to your questions:

  • Is partitioning a way to scale in Kafka?

    • Yes, partitions let you split the data of a topic and scale horizontally
  • If I create 3 partitions and have only 1 consumer, do I lose the messages in the 2 unused partitions?

    • No, a single consumer will be assigned all 3 partitions and read from each of them
  • And if I spin up 2 new consumers, does KafkaJS manage the assignment for new consumers from dedicated partitions?

    • Yes. When new consumers join the same consumer group (same groupId), the group goes through a rebalance and the partitions are redistributed; with 3 partitions and 3 consumers, each consumer is assigned 1 partition
  • Is eachBatch the only way to implement parallel processing in consumers, can it be done with eachMessage and control the rate of messages to process?

    • Unfortunately, I don't have enough knowledge to answer it :(
  • What is the recommended way to scale consumers?

    • The best way is to keep a 1:1 ratio between consumers and partitions, so spin up new consumers when needed (consumers beyond the partition count would sit idle). Why? Basically, it is simpler to handle than in-process concurrency
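
To make the assignment behaviour above more concrete, here is a minimal sketch of how partitions could be spread across the consumers of one group. It is a simplified round-robin model written for illustration, not KafkaJS's internal code; the function name `assignPartitions` and the consumer ids are hypothetical.

```typescript
// Simplified round-robin model of partition assignment inside ONE consumer group.
// Assumption: this only illustrates the idea; the real assignment is negotiated
// by the group coordinator and the client library during a rebalance.
function assignPartitions(
  partitions: number[],
  consumers: string[],
): Map<string, number[]> {
  const assignment = new Map<string, number[]>();
  for (const consumer of consumers) {
    assignment.set(consumer, []);
  }
  if (consumers.length === 0) {
    return assignment;
  }
  // Hand out partitions one by one, cycling through the consumers.
  partitions.forEach((partition, index) => {
    const owner = consumers[index % consumers.length];
    assignment.get(owner)!.push(partition);
  });
  return assignment;
}

const partitions = [0, 1, 2];

// 1 consumer: it owns all 3 partitions, so no messages are lost.
const oneConsumer = assignPartitions(partitions, ["consumer-a"]);

// 3 consumers: after the rebalance each one owns exactly 1 partition.
const threeConsumers = assignPartitions(partitions, [
  "consumer-a",
  "consumer-b",
  "consumer-c",
]);

// 4 consumers: the extra consumer gets nothing and stays idle.
const fourConsumers = assignPartitions(partitions, [
  "consumer-a",
  "consumer-b",
  "consumer-c",
  "consumer-d",
]);

console.log(oneConsumer, threeConsumers, fourConsumers);
```

The takeaway from the model is that the partition count caps the useful number of consumers in a group, so to go beyond 3 parallel consumers the topic would need more partitions.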