We are having an issue when using NHibernate with distributed transactions.
Consider the following snippet:
    //
    // There is already an ambient distributed transaction
    //
    using (var scope = new TransactionScope()) {
        using (var session = _sessionFactory.OpenSession())
        using (session.BeginTransaction()) {
            using (var cmd = new SqlCommand(_simpleUpdateQuery, (SqlConnection)session.Connection)) {
                cmd.ExecuteNonQuery();
            }
            session.Save(new SomeEntity());
            session.Transaction.Commit();
        }
        scope.Complete();
    }
Sometimes, when the server is under extreme load, we'll see the following:
1. The query executed with cmd.ExecuteNonQuery is chosen as a deadlock victim (we can see it in SQL Profiler), but no exception is raised.
2. session.Save fails with the error message, "The operation is not valid for the state of the transaction."
3. Every time this code is executed after that, session.BeginTransaction fails. The first few times, the inner exception varies (sometimes it is the deadlock exception that should have been raised in step 1). Eventually it settles on either "The server failed to resume the transaction. Desc:3800000177." or "New request is not allowed to start because it should come with valid transaction descriptor."
If left alone, the application will eventually (after seconds or minutes) recover from this condition.
Why is the deadlock exception not being reported in step 1? And if we can't resolve that, then how can we prevent our application from temporarily becoming unusable?
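One check that might help narrow this down is to inspect the ambient transaction's status right after cmd.ExecuteNonQuery and fail fast if it is no longer active. The following is only a minimal sketch: AmbientTransactionGuard and EnsureActive are hypothetical names, not part of NHibernate or ADO.NET, and it is an assumption that the status already reads Aborted at that point, since abort notifications for a distributed transaction may arrive asynchronously.

```csharp
using System;
using System.Transactions;

// Hypothetical helper for illustration; not part of NHibernate or ADO.NET.
static class AmbientTransactionGuard
{
    // Throws if there is an ambient transaction and it is no longer active.
    public static void EnsureActive()
    {
        Transaction current = Transaction.Current;
        if (current == null)
        {
            return; // No ambient transaction, nothing to check.
        }

        TransactionStatus status = current.TransactionInformation.Status;
        if (status != TransactionStatus.Active)
        {
            throw new TransactionAbortedException(
                "Ambient transaction is " + status + "; refusing to continue.");
        }
    }
}

static class Demo
{
    static void Main()
    {
        using (var scope = new TransactionScope())
        {
            // Passes while the transaction is still active.
            AmbientTransactionGuard.EnsureActive();

            // Simulate an abort by rolling back explicitly.
            Transaction tx = Transaction.Current;
            tx.Rollback();

            try
            {
                AmbientTransactionGuard.EnsureActive();
            }
            catch (TransactionAbortedException ex)
            {
                Console.WriteLine(ex.Message);
            }
            // scope.Complete() is intentionally not called.
        }
    }
}
```

In the snippet above, the call would go between cmd.ExecuteNonQuery() and session.Save(...). Even if it works, it would only surface the failure earlier; it does not explain why the deadlock exception is swallowed in the first place.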
The issue has been reproduced in the following environments:
- Windows 7 x64 and Windows Server 2003 x86
- SQL Server 2005 and 2008
- .NET 4.0 and 3.5
- NHibernate 3.2, 3.1 and 2.1.2
I've created a test fixture which will sometimes reproduce the issue for us. It is available here: http://wikiupload.com/EWJIGAECG9SQDMZ