11
votes

I'm just doing a little research on how to build a large-scale, globally accessible application on Azure.

There's lots of tech in there already to get your application as close to the consumer as possible.

  • CDN Edge Servers for static content shared around the world.
  • Cloud Services in different regions, using Traffic Manager to route a domain name to the nearest application host.

The bit I'm still confused about is the database. If you are using SQL Azure, you have to specify a region to place it in. If my SQL Azure instance is in West Europe (Amsterdam), but my customers are in Australia, accessing the application via an instance in Australia (NSW), there's going to be significant latency between the application and the database.
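To put a rough number on that worry (the round-trip figure below is an assumption, not a measurement), even a moderately chatty page compounds the distance quickly:

```python
# Rough arithmetic only: the RTT is an assumed figure for an
# Amsterdam <-> Sydney round trip, not something I've measured.
RTT_SECONDS = 0.280        # assumed app -> DB -> app round trip
QUERIES_PER_REQUEST = 5    # assumed sequential queries per page view

added_latency_ms = RTT_SECONDS * QUERIES_PER_REQUEST * 1000
print(f"Added latency per page view: {added_latency_ms:.0f} ms")  # ~1400 ms
```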

All the references I've seen to Geo-Replication seem to be in the context of a master-slave redundancy setup. But I'm wondering: is it feasible to have a master-master setup where each application instance talks to its own SQL Azure master instance in the same geo-region, and SQL Azure takes care of the bi-directional replication between them?

2
Well, does SQL Azure actually support geo master-master and 'take care' of bi-directional replication? – Remus Rusanu
That's kinda what I'm trying to figure out. And if Geo-Replication doesn't support this, is there some other way to do multi-master replication? I'm on a bit of a first-principles fact-finding mission at the minute and looking for guidance. – Eoin Campbell
I just stumbled across something called "Data Sync" that might be what I'm looking for. – Eoin Campbell

2 Answers

8
votes

Active Geo-Replication for Azure SQL Database:

The Active Geo-Replication feature implements a mechanism to provide database redundancy within the same Microsoft Azure region or in different regions (geo-redundancy). Active Geo-Replication asynchronously replicates committed transactions from a database to up to four copies of the database on different servers. The original database becomes the primary database of the continuous copy. Each continuous copy is referred to as an active secondary database. The primary database asynchronously replicates committed transactions to each of the active secondary databases. While at any given point, the active secondary data might be slightly behind the primary database, the active secondary data is guaranteed to always be transactionally consistent with changes committed to the primary database. Active Geo-Replication supports up to four active secondaries, or up to three active secondaries and one offline secondary.

One of the primary benefits of Active Geo-Replication is that it provides a database-level disaster recovery solution. Using Active Geo-Replication, you can configure a user database in the Premium service tier to replicate transactions to databases on different Microsoft Azure SQL Database servers within the same or different regions. Cross-region redundancy enables applications to recover from a permanent loss of a datacenter caused by natural disasters, catastrophic human errors, or malicious acts.

Another key benefit is that the active secondary databases are readable. Therefore, an active secondary can act as a load balancer for read workloads such as reporting. While you can create an active secondary in a different region for disaster recovery, you could also have an active secondary in the same region on a different server. Both active secondary databases can be used to balance read only workloads serving clients distributed across several regions.

Notice that master-master is nowhere mentioned. Replicas are readable, never writable. So the question is really moot, as SQL Azure simply does not support what you wish.
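The secondaries are readable, though, so reads can still be served locally. As a hedged sketch of what that looks like from application code (server names, credentials, and driver version are placeholder assumptions, and it presumes a readable secondary already exists in the nearby region):

```python
import pyodbc

# Sketch only: "myserver-weu" / "myserver-aus" and the credentials are
# invented placeholders; substitute your own geo-replication pair.
PRIMARY = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver-weu.database.windows.net;DATABASE=appdb;"
    "UID=appuser;PWD=...;"
)
# Reads can be pointed at the nearby readable secondary; writes cannot.
SECONDARY_READONLY = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver-aus.database.windows.net;DATABASE=appdb;"
    "UID=appuser;PWD=...;ApplicationIntent=ReadOnly;"
)

def get_connection(for_write: bool) -> pyodbc.Connection:
    """Route writes to the primary, reads to the local secondary."""
    return pyodbc.connect(PRIMARY if for_write else SECONDARY_READONLY)
```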

An alternative would be application-layer sharding: have each tenant connect to a database in its own region. But that assumes the data is disjoint (Australian customers do not look at South American items). See this answer here.
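A minimal sketch of that sharding idea, with made-up region keys and connection strings:

```python
# Hedged sketch of application-layer sharding by tenant home region.
# Region codes and connection strings are illustrative placeholders.
SHARD_MAP = {
    "west-europe": "Server=sql-weu.database.windows.net;Database=appdb;...",
    "australia":   "Server=sql-aus.database.windows.net;Database=appdb;...",
}

def connection_string_for(tenant: dict) -> str:
    """Each tenant lives wholly in one regional database (disjoint data)."""
    return SHARD_MAP[tenant["home_region"]]
```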

You can also investigate things like Cassandra, which does support what you want but is a major paradigm shift and you'll need to host it and manage it.

But you also have to ask: is a master-master DB required to achieve low latency? Are writes really occurring that often in your app? Read latency can easily be improved; that's what caching and CDNs are for. Think about all the Australian users reading this question: it is served from a database that is geo-replicated for disaster recovery, not from a master-master DB. See How StackOverflow scales SQL Server.
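For illustration, a toy read-through cache (in reality Redis/Azure Cache or a CDN would fill this role; everything here is a simplified stand-in):

```python
import time

# Minimal regional read-through cache sketch. An in-process dict is a
# stand-in for a real cache tier; the TTL value is arbitrary.
_cache: dict[str, tuple[float, object]] = {}
TTL_SECONDS = 60

def cached_read(key: str, load_from_db) -> object:
    """Serve repeat reads locally; only cache misses pay the cross-region trip."""
    hit = _cache.get(key)
    if hit and time.time() - hit[0] < TTL_SECONDS:
        return hit[1]
    value = load_from_db(key)          # the expensive cross-region query
    _cache[key] = (time.time(), value)
    return value
```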

2
votes

Caveat: I have not worked with SQL Azure in this respect, but I have worked with geo replication extensively.

From what I can tell, you are correct to say that the Active Geo-Replication built into Azure is a one-way copy: you have a master database in one location which ships transactions out to other databases that are available on a read-only basis.

Getting full two-way replication is a very tricky task. The opportunities for failure are enormous and exceedingly hard to test, which is why few people offer two-way replication for transactional databases: even if two databases hold the same data, they will have different transaction histories and will not accurately mirror one another. Then, when you have to decide which database is authoritative, things get complicated fast.
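A toy example of that authority problem (the fields and the merge policy are invented for illustration, not a recommendation):

```python
# Two regions update the same row concurrently, so their histories
# diverge. A last-writer-wins merge (shown here) silently discards one
# update; alternatives (vector clocks, CRDTs, app-level merge) trade
# complexity for safety.
row_in_europe    = {"id": 1, "email": "old@example.com", "updated_at": 100}
row_in_australia = {"id": 1, "email": "new@example.com", "updated_at": 105}

def merge_last_writer_wins(a: dict, b: dict) -> dict:
    return a if a["updated_at"] >= b["updated_at"] else b

print(merge_last_writer_wins(row_in_europe, row_in_australia))
# Europe's write is lost; whether that is acceptable depends on your data.
```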

However, this does not necessarily prevent you from implementing practical two-way replication. When you know your own data and understand what needs to be replicated and what does not, you are no longer solving replication as an abstract problem; you can design around the data you actually have. If you are contemplating working at this kind of scale, you will already be using a lot of queues to pass data around. To take a very simple example: if your service pushes data into a queue for a worker to pop and store in the database, it would not be hard to push the same data into transfer queues bound for your other geographical regions as part of the same processing step, as sketched below.
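A minimal sketch of that fan-out, using in-process queues as stand-ins for whatever messaging service you actually run (Service Bus, Storage queues, etc.; the region names are placeholders):

```python
import json
import queue

# One local queue feeds the regional database writer; the same payload
# is mirrored to transfer queues bound for the other regions.
local_db_queue = queue.Queue()
transfer_queues = {"west-europe": queue.Queue(), "australia": queue.Queue()}

def publish_write(payload: dict, home_region: str) -> None:
    message = json.dumps(payload)
    local_db_queue.put(message)        # picked up by the local DB writer
    for region, q in transfer_queues.items():
        if region != home_region:
            q.put(message)             # replicated asynchronously elsewhere
```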

Ultimately, you need to ask yourself how many million users you have and how many gigabytes of data they are going to push into your databases. If those numbers are fairly low, then two-way replication is almost certainly unnecessary, and thinking too hard about it is probably a premature optimisation.