
We have a Service Fabric application with several actors and services. The actors and services deploy and run successfully on most of the development machines. On a few of them, however, one of the services shows an error status for no apparent reason.

In Service Fabric Explorer, the error icon appears at every level from the cluster down to the partition. The node, however, did not initially show an error state. After several minutes, the node showed a warning icon and the following error message:

Unhealthy event: SourceId='System.RA', Property='ReplicaOpenStatus', HealthState='Warning', ConsiderWarningAsError=false.
Replica had multiple failures during open. Error = System.TypeLoadException (-2146233054)
Could not load type 'Microsoft.ServiceFabric.Data.ReliableStateManagerImpl' from assembly 'Microsoft.ServiceFabric.Data.Impl, Version=5.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'
   at Microsoft.ServiceFabric.Data.ReliableStateManager.get_Impl()
   at Microsoft.ServiceFabric.Data.ReliableStateManager.Microsoft.ServiceFabric.Data.IStateProviderReplica.Initialize(StatefulServiceInitializationParameters initializationParameters)
   at Microsoft.ServiceFabric.Services.Runtime.StatefulServiceBase.System.Fabric.IStatefulServiceReplica.Initialize(StatefulServiceInitializationParameters initializationParameters)
   at System.Fabric.ServiceFactoryBroker.CreateHelper[TFactory,TReturnValue](IntPtr nativeServiceType, IntPtr nativeServiceName, UInt32 initializationDataLength, IntPtr nativeInitializationData, Guid partitionId, Func`3 creationFunc, Action`2 initializationFunc, ServiceInitializationParameters initializationParameters)

I set breakpoints and found that every expected line of Program.cs executes, eventually reaching Thread.Sleep(Timeout.Infinite). However, not a single line of the failing service's code is hit.

The service at issue is stateful and shares an assembly with a stateless actor. I don't tend to put two actors/services in one DLL, but a colleague did, and it works for most of the team. I don't know whether this could be the cause.

Everyone on the team was running SDK 1.5 when this occurred on two dev machines. I upgraded to SDK 2.0 (but didn't update the NuGet assembly references to the 2.0 assemblies). Same problem.

I did a search for the assembly 'Microsoft.ServiceFabric.Data.Impl' and found it under [Program Files]\Microsoft Service Fabric\bin\Fabric\Fabric.Code. It is version 5.0.135.9590.

My colleague, who has it working, has that same file as version 4.5.175.9590.
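To make the comparison between the two machines concrete, here is a minimal Python sketch. It only parses the dotted version strings quoted above and checks whether they agree on major.minor. The helper names `parse_version` and `same_major_minor` are hypothetical, and the sketch does not query Service Fabric or the assemblies themselves.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '5.0.135.9590' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def same_major_minor(a: str, b: str) -> bool:
    """Return True when both versions share the same major and minor numbers."""
    return parse_version(a)[:2] == parse_version(b)[:2]


# Values copied from the question; nothing here is read from disk.
failing_machine = "5.0.135.9590"   # Microsoft.ServiceFabric.Data.Impl on the failing machine
working_machine = "4.5.175.9590"   # same file on the colleague's machine
named_in_error = "5.0.0.0"         # assembly version named in the TypeLoadException

print(same_major_minor(failing_machine, working_machine))
print(same_major_minor(failing_machine, named_in_error))
```

If the first check reports a mismatch, the two machines are running different runtime builds, which is at least consistent with the service loading on one machine and failing on the other.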

Any help or advice is greatly appreciated.


1 Answer


Unfortunately I don't know the specifics, but I believe the issue will go away once you've fully upgraded (latest SDK + latest NuGet packages).