
I have a single topic, suppose it is named "Test", and suppose it has 4 partitions: P1, P2, P3, P4. Now I am sending a message, say M1, from a Kafka producer. I want message M1 to be written to all partitions P1, P2, P3, P4. Is it possible? If yes, how can I do that? (I am new to this; I am using Kafka-Node.)

1
Can you explain your use case and the purpose of doing this? It would mean intentionally introducing message duplication. Can you think about, in terms of consumption, consistency, and replication, what this would bring you? – Armando Ballaci
@ArmandoBallaci I need to save the data to two different places; for now, suppose two different files, File1 and File2. If the data is available in 2 partitions, then both consumers can read the data in parallel, so the data will be saved to both files in parallel. – Anuresh Verma

1 Answer


According to the documentation, you can specify the partition of a ProducerRecord. That way you can write the same message to multiple partitions of the same topic. The API for this looks like this in Java:

ProducerRecord(String topic, Integer partition, K key, V value)

Overall, your approach could look like the snippet below, although I would also question this approach of duplicating data and would rather reconsider the design.

// props: pre-configured producer Properties (bootstrap.servers, key/value serializers, ...)
Producer<String, String> producer = new KafkaProducer<>(props);

// send the same key/value pair to each of the four partitions of topic "Test"
for (int part = 0; part < 4; part++) {
    producer.send(new ProducerRecord<String, String>("Test", part, "Hello", "World!"));
}

producer.close(); // flushes any buffered records and closes the producer

EDIT (after a comment from the OP with more background on the use case):

From your comment I understand that you want to read the data in parallel and perform two different steps. Instead of writing the same message to two different partitions within the same topic, I'd rather recommend storing the data only once in your topic (in whatever partition). On the consumer side, make sure that your two consumers use different consumer groups (configuration: group.id). If they belong to two different consumer groups, they will be able to process the data in parallel. Kafka does not delete a message just because it has been consumed, so it can be consumed by as many different(!) consumer groups as you like. Data in Kafka is deleted only based on retention time or size, which is configured at the topic level and is independent of producers and consumers.
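
For illustration, here is a minimal sketch of one such consumer (assuming a broker at localhost:9092 and String keys and values; the topic name "Test" is from your question, while the group name file1-group is just a placeholder I made up). Running a second instance with group.id set to, say, file2-group gives you the parallel consumption described above, with each group receiving every message:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class FileConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        // The only difference between your two consumers: start the second
        // instance with group.id "file2-group" so each group gets all messages.
        props.put("group.id", "file1-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("Test"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    // here you would write record.value() to File1 (or File2 in the other group)
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}

Note that within a single consumer group Kafka splits the partitions among the group's members, so two consumers in the same group would each see only part of the data; two separate groups each see all of it.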