0
votes

During development of our application we have introduced Azure SQL Geo-replication to improve user experience and responsiveness in various geographical locations.

Currently in testing we have two instances - a primary and secondary*, one of which is in US, the other in Europe.

The issue

Newly updated data from the primary database does not appear to be available in the secondary database even though sp_wait_for_database_copy_sync is called, which should ensure this.

Steps to reproduce issue

  1. User connects to a secondary instance
  2. User makes an update to the data (an 'UPDATE' or 'INSERT' that goes to the primary database)
  3. Transaction is committed
  4. The sp_wait_for_database_copy_sync procedure is called with appropriate parameters to ensure that the data will have replicated into the secondary instance when the update call unblocks
  5. Once update call unblocks, an attempt to pull data (including the newly updated or inserted) is made on the secondary database.
  6. The newly updated or inserted data is not contained within the result set - making it appear at though the database copy synchronization procedure did not ensure that data will be replicated on update call unblock

Technical implementation details

When an entity is about to be updated, transaction committed and data synchronization ensured, the following three lines of code are called, in order:

  1. ISession.Save(object obj) is used to save new entities
  2. ISession.Flush() is used to commit transaction
  3. ISession.CreateSQLQuery("EXEC sys.sp_wait_for_database_copy_sync @target_server = N\'secondary-server\', @target_database = N\'database\';").ExecuteUpdate() is used to execute the call blocking synchronization procedure

Questions

What could be wrong with the above three lines of code? The possible culprits I see are:

  1. ISession.Flush not executing synchronously, therefore when the blocking stored procedure is executed the transaction is not yet actually committed
  2. ISession.CreateSQLQuery("EXEC sys.sp_wait_for_database_copy_sync @target_server = N\'secondary-server\', @target_database = N\'database\';").ExecuteUpdate() is not actually being blocked.

Any ideas on how to troubleshoot the above two or perhaps some other issues you see would be greatly appreciated.

  • Primary and secondary in this case are in regards to geo-replication set-up.
4

4 Answers

0
votes

Azure SQL DB geo-replication is designed for user databases, not for the master database. Docs will be updated to make this clear.

0
votes

When you say "master" here you refer to the primary database in the geo-replication relationships, or the logical master database?

0
votes

NHibernate is synchronous. ISession.Flush is always synchronous. But if you are in a distributed TransactionScope (check System.Transactions.Transaction.Current?.TransactionInformation?.DistributedIdentifier ?? default(Guid) != Guid.Empty), then MSDTC kicks in and causes the actual commit in database to be asynchronous, eventually occurring after the scope disposal. Read more here..

I am not sure Azure accepts distributed transactions so that may not be the cause of your trouble.

If you are in a distributed transaction case, then you will have, depending on the locks caused by your queries/database settings/..., either to read back your committed data from database, relying on the transaction locks to block you until it gets actually committed, then perform your synchronization. Or if you are not locked, poll the database until seing your data (which is even worst for performances, of course), then perform your synchronization.

By the way, I do not know how Azure geo-replication works, but if it uses MSDTC too, then it could have itself those asynchronous issues.

About the SQL Exec not being actually blocked, then that would be a SQL Azure issue.

0
votes

I have forgotten to post here, but the issue was that the following procedure, while blocking, did not actually guarantee that once unblocked data will be available in all instances: sp_wait_for_database_copy_sync.