I wasn't able to post this as a comment because it is quite long.
What you mentioned in your comment, "we need an equal number of partitions for consumer application", is correct. However, it only applies if all the consumers (5 in your case) belong to the same consumer group.
For example, suppose topic T has 5 partitions and we create a consumer C1 in consumer group G1. C1 will receive messages from all 5 partitions of topic T. If we then add consumer C2 to the same group G1, Kafka rebalances the partitions: C1 might consume from 3 partitions and C2 from the remaining 2 (or vice versa). The "one partition per consumer application" setup you mentioned is the ideal case of this situation, where 5 consumers in the same consumer group (G1) consume from all 5 partitions in parallel. This is how Kafka achieves scalability.
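To make the same-group case concrete, here is a rough sketch (topic "T" and group "G1" are just placeholders from the example above); running this same code in two processes makes Kafka split the 5 partitions between the two consumers:

// Both consumer instances use the SAME group id, so they share the partitions of topic T
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "G1"); // same group id in every instance
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("T"));
// With one instance running, it gets all 5 partitions; start a second instance
// and a rebalance assigns e.g. 3 partitions to one and 2 to the other.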
In your case, however, you need the same data to be read 5 times because you have 5 consumers. Instead of publishing the same messages to 5 partitions and then consuming them with 5 consumers in one group, you can write a simple producer application that publishes the data to a topic with 1 partition. Your 5 consumer applications can then consume the same data independently; that is why I suggested giving each consumer application a random consumer-group name, so each one reads all the messages (and commits its own offsets) independently.
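For the producer side, a minimal sketch could look like the following. The topic name "myTopic", the key and the payload are placeholders (not from your setup), and the topic is assumed to have a single partition:

// Required imports: java.util.Properties,
// org.apache.kafka.clients.producer.ProducerConfig, KafkaProducer, ProducerRecord
Properties producerProps = new Properties();
producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArraySerializer");
KafkaProducer<String, byte[]> producer = new KafkaProducer<>(producerProps);
byte[] payload = "some message".getBytes(); // placeholder payload
producer.send(new ProducerRecord<>("myTopic", "someKey", payload));
producer.close();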
Below are the code snippets for two consumers consuming messages from the same topic (1 partition) in parallel:
Consumer 1:
// Required imports: java.util.Properties, java.util.UUID, java.util.Collections,
// org.apache.kafka.clients.consumer.ConsumerConfig, org.apache.kafka.clients.consumer.KafkaConsumer
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Random group id so this consumer reads the topic independently of the others
props.put(ConsumerConfig.GROUP_ID_CONFIG, UUID.randomUUID().toString());
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// Auto-commit disabled: offsets must be committed manually (see the poll loop below)
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
KafkaConsumer<String, byte[]> consumerLiveVideo = new KafkaConsumer<>(props);
consumerLiveVideo.subscribe(Collections.singletonList(topicName[0]));
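Since auto-commit is disabled for this consumer, the poll loop has to commit the offsets itself. A minimal sketch of such a loop (not part of your original snippet; it also needs java.time.Duration and the ConsumerRecord/ConsumerRecords imports) could look like this:

while (true) {
    ConsumerRecords<String, byte[]> records = consumerLiveVideo.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, byte[]> record : records) {
        // process record.key() / record.value() here
    }
    consumerLiveVideo.commitSync(); // commit offsets only after the batch is processed
}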
Consumer 2:
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, UUID.randomUUID().toString());
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// Auto-commit is left at its default (true) here, so offsets are committed automatically
KafkaConsumer<String, byte[]> consumerLiveVideo = new KafkaConsumer<>(props);
consumerLiveVideo.subscribe(Collections.singletonList(topicName[0]));
You also asked about the correct approach; in my opinion, a single consumer application is all you need. Also, don't mix up the concepts of replication and scalability in Kafka, as both are critical.
You also mentioned that the data is critical; for that, have a look at the producer configuration parameter acks (use acks=1 or acks=all depending on your scenario).
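For example, on the producer sketched above you would set it like this (a one-line illustration, reusing the producerProps object from that sketch):

producerProps.put(ProducerConfig.ACKS_CONFIG, "all"); // "1" = leader ack only, "all" = wait for all in-sync replicas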
For more details about scalability, replication, consumer groups, and consumers/producers/brokers/topics, please go through chapters 1-5 of Kafka: The Definitive Guide.