We are migrating from Confluent Replicator to the open-source Apache MirrorMaker 2.0. When MirrorMaker is started on a topic that Replicator was already replicating, the messages Replicator has already copied are replicated again, which leaves duplicate messages on the target cluster. More details:

  1. RCA: Replicator assigns a consumer group for replicating messages, and this consumer group tracks the offsets of the source topic. We are not able to assign the same consumer group in the consumer config of MirrorMaker 2.
  2. MirrorMaker 1.0: works, because the same consumer group can be set in the consumer.properties file, so messages are picked up right where Replicator stopped.
  3. We tried setting source.cluster.consumer.group.id in MirrorMaker 2.0 in every available mode (cluster mode, connect-standalone, and connect-distributed), but MirrorMaker 2.0 still assigns a null consumer group id while replicating messages.

Any pointers from anyone who has done the same and managed to keep the same offsets with MirrorMaker 2.0?

1 Answer

We found a crude way to solve this issue. The high-level steps are:

  • Read the messages from the internal topic where Replicator stores its offsets. [connect-offsets]
  • This topic stores the offsets for all topics being replicated as key:value pairs. For example:

Key : ["replicator-group",{"topic":"TEST","partition":0}]
Value: {"offset":24}

  • For each topic and partition, whenever a new message is replicated, a new message with the same key but a higher offset is produced to the connect-offsets topic.
  • Convert the key of this message to the MirrorMaker 2 format and produce it to MirrorMaker 2's internal topic. [You can change the internal topics in the mirrormaker2-connect-distributed.properties file.] The format for the MirrorMaker internal topic is:

Key: ["mirrormaker-group",{"cluster":"","partition":0,"topic":"TEST"}]
Value: {"offset":24}

  • After the message is posted and MirrorMaker is restarted, it reads the internal topic to get the latest offset for each topic it has to replicate. It then resumes from that offset, so no duplicate messages are replicated.
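
The key conversion in the steps above could be sketched roughly as below, in Python. This is only an illustration of the key rewrite, not the exact script we ran. The function names are made up for this example, and reading from and producing to Kafka is left out, so the records are passed in as plain (key, value) string pairs in topic order. The group names and the empty "cluster" field are taken from the sample messages above and would need to match your own setup. The value is copied unchanged, so it is worth checking on a test topic that both tools interpret the stored offset the same way.

```python
import json


def latest_offsets(records):
    """Keep only the last value seen for each key.

    `records` is an iterable of (key, value) strings in topic order,
    so a later record for the same key overwrites the earlier one.
    """
    latest = {}
    for key, value in records:
        latest[key] = value
    return latest


def to_mm2_key(replicator_key, mm2_name="mirrormaker-group", cluster_alias=""):
    """Rewrite a Replicator offset key into the MirrorMaker 2 key layout.

    `mm2_name` and `cluster_alias` are assumptions copied from the sample
    messages in this answer; adjust them to your deployment.
    """
    _replicator_name, partition_info = json.loads(replicator_key)
    mm2_partition = {
        "cluster": cluster_alias,
        "partition": partition_info["partition"],
        "topic": partition_info["topic"],
    }
    # Compact separators reproduce the layout shown in the sample key.
    return json.dumps([mm2_name, mm2_partition], separators=(",", ":"))


def convert(records, mm2_name="mirrormaker-group", cluster_alias=""):
    """Return {mm2_key: value} for the latest offset of every topic-partition."""
    return {
        to_mm2_key(key, mm2_name, cluster_alias): value
        for key, value in latest_offsets(records).items()
    }


if __name__ == "__main__":
    sample = [
        ('["replicator-group",{"topic":"TEST","partition":0}]', '{"offset":10}'),
        ('["replicator-group",{"topic":"TEST","partition":0}]', '{"offset":24}'),
    ]
    for mm2_key, value in convert(sample).items():
        print("Key:", mm2_key)
        print("Value:", value)
```

Each resulting pair would then be produced to the MirrorMaker 2 offsets topic with whatever producer client you already use, before MirrorMaker 2 is started.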