I am using Google Cloud Dataflow with Apache Beam to consume messages from a Kafka topic that exists in two different Kafka clusters. Both clusters contain the same topic names but hold different data, because each cluster serves a separate region.
Is it possible to consume data from both clusters by listing the bootstrap servers of both clusters in a single KafkaIO.read step of one Dataflow pipeline?
KafkaIO.<String, String>read()
    .withBootstrapServers("CLUSTER1_SERVER:PORT,CLUSTER2_SERVER:PORT");
I read the documentation on Kafka bootstrap servers, but it wasn't clear to me whether, after connecting to a bootstrap server, messages would be consumed only from the cluster of the first successful bootstrap connection, or whether the client would try every bootstrap server listed and consume from every cluster it finds. If it is the former, I will need to create a second Dataflow pipeline to process the messages from the second cluster, but it would be much easier to process messages from both clusters in a single pipeline.
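In case it helps clarify what I mean by "a single pipeline", here is a rough sketch of the fallback I am considering if one withBootstrapServers call can't span clusters: two KafkaIO reads, one per cluster, flattened into one PCollection. The topic name, deserializers, and server addresses are placeholders, and I haven't verified this is the idiomatic approach.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;
import org.apache.kafka.common.serialization.StringDeserializer;

public class TwoClusterRead {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // One KafkaIO read per cluster; each read only talks to its own cluster.
    PCollection<KV<String, String>> cluster1 =
        p.apply("ReadCluster1",
            KafkaIO.<String, String>read()
                .withBootstrapServers("CLUSTER1_SERVER:PORT")
                .withTopic("my-topic")  // placeholder topic name
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata());

    PCollection<KV<String, String>> cluster2 =
        p.apply("ReadCluster2",
            KafkaIO.<String, String>read()
                .withBootstrapServers("CLUSTER2_SERVER:PORT")
                .withTopic("my-topic")
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata());

    // Merge messages from both regions into a single PCollection
    // so the rest of the pipeline is shared.
    PCollection<KV<String, String>> merged =
        PCollectionList.of(cluster1).and(cluster2).apply(Flatten.pCollections());

    p.run();
  }
}
```

This doubles the number of read steps but keeps all downstream processing in one pipeline, which is what I'm hoping to avoid duplicating.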
Any information would be greatly appreciated.