5
votes

We are running an NServiceBus-based service, using the NServiceBus.Host.exe host process.

Twice in production during the last few months the Windows Service has suddenly stopped, leaving the following event in the Application Event Log:

Application: NServiceBus.Host.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.InvalidOperationException
Stack:
   at System.Transactions.TransactionState.ChangeStatePromotedPhase0(System.Transactions.InternalTransaction)
   at System.Transactions.Phase0VolatileDemultiplexer.InternalPrepare()
   at System.Transactions.VolatileDemultiplexer.PoolablePrepare(System.Object)
   at System.Transactions.Oletx.OletxVolatileEnlistment.Prepare(System.Transactions.Oletx.OletxVolatileEnlistmentContainer)
   at System.Transactions.Oletx.OletxPhase0VolatileEnlistmentContainer.Phase0Request(Boolean)
   at System.Transactions.Oletx.OletxTransactionManager.ShimNotificationCallback(System.Object, Boolean)
   at System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(System.Object, Boolean)

We got this error during a period of several minutes of network instability (e.g. lots of timeouts against the database, which are visible in our log4net log files).

Any ideas as to what is failing here?

We see no fatal errors in our log4net logfiles.

Versions:

  • Windows Server 2008 R2
  • .NET Framework 4.5.2
  • NServiceBus 4.7.5
  • NHibernate 3.3.3.4001 (used for saga, subscription and timeout persister)
  • SQL Server 2012
1
Looks like a DTC-related error to me. It probably bubbles up from NHibernate when the connection to the database is failing? - Sean Farmar

1 Answer

3
votes

It seems the behavior you see is related to how NHibernate handles the TransactionCompleted event. It is also somewhat related to this question.

The AdoNetWithDistributedTransactionFactory registers an anonymous delegate on the TransactionCompleted event. This event gets fired on a background thread by using ThreadPool.QueueUserWorkItem. If that operation throws an exception due to connectivity issues with your database server (e.g. due to a network partition), nothing on that thread-pool thread catches it, so it surfaces as an unhandled exception on the AppDomain. Unhandled exceptions terminate the process, and therefore also the NServiceBus.Host.

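The best available workaround is to register an UnhandledException handler on the current AppDomain, like the following, so that the exception at least ends up in your log. Be aware that since .NET 2.0 this handler is a notification only: on its own it does not stop the process from terminating.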

// Register once at startup, before the bus is started.
AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;

private static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
{
    // ExceptionObject is typed as object, so cast it before logging.
    LogManager.GetLogger(typeof(AppDomain)).Fatal("Unhandled exception", e.ExceptionObject as Exception);
}
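To see what the handler can and cannot do, here is a minimal, self-contained console sketch. It is not NHibernate or NServiceBus code: the throwing work item is only a stand-in for a callback that fails on a thread-pool thread, and the Program class and console output are assumptions made for illustration. Under the default runtime policy I would expect the handler to run and report IsTerminating as true before the process exits.

```csharp
using System;
using System.Threading;

internal static class Program
{
    private static void Main()
    {
        // Register the handler before any background work is started.
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;

        // Hypothetical stand-in for a callback that fails on a thread-pool thread.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            throw new InvalidOperationException("Simulated failure on a thread-pool thread");
        });

        // Give the work item time to run; the process normally terminates before this returns.
        Thread.Sleep(5000);
        Console.WriteLine("Only reached if the runtime did not terminate the process.");
    }

    private static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // The console keeps this sketch dependency-free;
        // a real service would write to its logger instead.
        Console.Error.WriteLine(
            "Unhandled exception (IsTerminating={0}): {1}",
            e.IsTerminating,
            e.ExceptionObject);
    }
}
```

In other words, the handler gives you a log entry that explains the crash, while service recovery settings or a fix in the failing code path are still needed to keep the service available.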

More information see

Once this has dealt with the immediate problem, it would make sense to find the root cause of the connection issues between NHibernate and your database server.