We have Kafka running on Azure HDInsight, provisioned in East US, and our hot-standby region is West US.
How do I configure Azure HDInsight to support disaster recovery with automatic failover? Would it affect the connection string?
Azure HDInsight was developed with a unique architecture for ensuring high availability (HA) of critical services. Some components of this architecture were developed by Microsoft to provide automatic failover. Other components are standard Apache components that are deployed to support specific services.
The HDInsight documentation on high availability explains the architecture of the HA service model, how HDInsight supports failover for HA services, and best practices for recovering from other service interruptions.
For Azure HDInsight Kafka clusters, you can use Kafka's mirroring feature to replicate Apache Kafka topics to a second Kafka cluster on Azure HDInsight.
What is Kafka mirroring?
Kafka's mirroring feature makes it possible to maintain a replica of an existing Kafka cluster.
How does Apache Kafka mirroring work?
Mirroring works by using the MirrorMaker tool (part of Apache Kafka) to consume records from topics on the primary cluster and then create a local copy on the secondary cluster. MirrorMaker uses one (or more) consumers that read from the primary cluster, and a producer that writes to the local (secondary) cluster.
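As a rough sketch of the consumer/producer split described above, the Python helper below renders the two properties files that the classic MirrorMaker tool reads, plus the command line that starts it. The broker addresses, group id, and topic name are hypothetical placeholders, and the script path is the usual HDInsight location, which you should verify on your own cluster version.

```python
import shlex


def render_properties(settings):
    """Render a dict as Java .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


def mirror_maker_files(primary_brokers, secondary_brokers, group_id):
    """Return (consumer.properties, producer.properties) contents as strings."""
    # Consumer side: MirrorMaker reads records from the PRIMARY cluster.
    consumer = render_properties({
        "bootstrap.servers": ",".join(primary_brokers),
        "group.id": group_id,
    })
    # Producer side: MirrorMaker writes the copies to the SECONDARY cluster.
    producer = render_properties({
        "bootstrap.servers": ",".join(secondary_brokers),
        "compression.type": "none",
    })
    return consumer, producer


def mirror_maker_command(topic_whitelist, num_streams=1):
    """Build the argument list that launches MirrorMaker on the secondary cluster."""
    return [
        # Assumed script location on an HDInsight head node; check your cluster.
        "/usr/hdp/current/kafka-broker/bin/kafka-run-class.sh",
        "kafka.tools.MirrorMaker",
        "--consumer.config", "consumer.properties",
        "--producer.config", "producer.properties",
        "--whitelist", topic_whitelist,
        "--num.streams", str(num_streams),
    ]


if __name__ == "__main__":
    # Hypothetical broker IPs for the East US (primary) and West US (secondary) clusters.
    consumer, producer = mirror_maker_files(
        ["10.0.0.4:9092", "10.0.0.5:9092"],
        ["10.1.0.4:9092", "10.1.0.5:9092"],
        "mirrorgroup",
    )
    print(consumer)
    print(producer)
    print(shlex.join(mirror_maker_command("testtopic", num_streams=4)))
```

MirrorMaker is normally run on the secondary cluster, so the producer writes locally and only the consumer traffic crosses regions.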
The most useful mirroring setup for disaster recovery utilizes Kafka clusters in different Azure regions. To achieve this, the virtual networks where the clusters reside are peered together.
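On the connection-string part of the question: the secondary cluster has its own broker hosts, and MirrorMaker only copies records; it does not redirect clients. A client that fails over therefore needs the secondary cluster's broker list as its bootstrap servers. The sketch below shows one way a client could make that choice. The broker addresses are hypothetical, and the reachability check is a plain TCP probe chosen for illustration, not an HDInsight or Kafka API.

```python
import socket


def tcp_reachable(host_port, timeout=2.0):
    """Return True if a TCP connection to 'host:port' succeeds within the timeout."""
    host, _, port = host_port.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False


def choose_bootstrap_servers(primary, secondary, probe=tcp_reachable):
    """Use the primary broker list if any primary broker answers, else the secondary."""
    if any(probe(broker) for broker in primary):
        return ",".join(primary)
    return ",".join(secondary)


if __name__ == "__main__":
    # Hypothetical broker IPs for the East US (primary) and West US (secondary) clusters.
    servers = choose_bootstrap_servers(
        ["10.0.0.4:9092", "10.0.0.5:9092"],
        ["10.1.0.4:9092", "10.1.0.5:9092"],
    )
    print("bootstrap.servers =", servers)
```

Record offsets on the mirrored topics generally differ from those on the primary, so a consumer that switches clusters should not assume its committed offsets carry over.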
The following diagram illustrates the mirroring process and how the communication flows between clusters:
For more details, refer to "Use MirrorMaker to replicate Apache Kafka topics with Kafka on HDInsight" and "Big data streaming: Choices for high availability and disaster recovery on Microsoft Azure".