17
votes

The Azure Service Fabric appears to be focused on scenarios in which all data can fit within RAM and persistence is used as a backing store. Reliable Services are designed to store information in Reliable Collections, which use a log-checkpoint system where logged information is written into RAM. Meanwhile, for Reliable Actors, the default actor state provider is "the distributed Key-Value store provided by Service Fabric platform." This seems to indicate that the same limitations would apply.

There may, however, be situations in which one would like to use the Service Fabric for "hot data" but write "cold data" to some form of permanent storage. What are best practices for handling this transition?

In Orleans, this seems to be handled automatically, using a persistence store such as Azure tables. But it seems that a principal design purpose of the Service Fabric and the Reliable Collections are to avoid needing external services, thus enhancing data locality. The current documentation anticipates the possibility that one would want to move data into some permanent store for disaster recovery and analytics, but it does not discuss the possibility of moving data back and forth between persistence-backed in-memory actors and more permanent forms of storage.

A possible answer is that the Service Fabric already does this. Maybe a Reliable Dictionary has some built-in mechanism for switching between persistence-backed in-memory storage and permanent storage.

Or, maybe the answer is that one must manage this oneself. One approach might be for an Actor to keep track of how "hot" it is and switch its persistence store as necessary. But this sacrifices one of the benefits of the Actor model, the automatic allocation and deallocation of actors. Similarly, we might periodically remove items from the Reliable Dictionary and add it to some other persistence store, and then add them back. Again, though, this requires knowledge of when it makes sense to make the transition.

A couple of examples may help crystallize this:

(1) Suppose that we are implementing a multiplayer game with many different "rooms." We don't need all the rooms in memory at once, but we need to move them into memory and use local persistence as a backup once players join them.

(2) Suppose that we are implementing an append-only B-Tree as part of a database. The temptation would be to have each B-Tree node be a stateful actor. We would like hot b-trees to remain in memory but of course the entire index can't be in memory. It seems that this is a core scenario that is already implemented for things like DocumentDB, but it's not clear to me from the documentation how one would do this.

A related question that I found is here. But that question focuses on when to use Azure Service Fabric vs. external services. My question is on whether there is a need to transition between them, or whether Azure Service Fabric already has all the capability needed here.

2

2 Answers

14
votes

The Key-Value store state provider does not require everything to be kept in memory. This provider actually stores the state of all actors on the local disk and the state is also replicated to the local disk on other nodes. So the KVS store is considered a persistent and reliable store.

In addition to that, the state of active actors is also stored in memory. When an actor hasn't been used in a while, it gets deactivated and garbage collected. When this happens, the in-memory copy is freed and only the copy on disk remains. When the actor is activated again, the state is fetched from disk and remains in memory as long as the actor is active.

Also, KVS is not the only built-in state provider. We also have the VolatileActorStateProvider (http://azure.microsoft.com/en-gb/documentation/articles/service-fabric-reliable-actors-platform/#actor-state-provider-choices). This is the state provider that keeps everything in memory.

6
votes

The KvsActorStateProvider does indeed store actor state in a KeyValueStore which is a similar structure to the ReliableDictionary.

The first question I'd ask is whether you need to relegate old actors state to cold storage? The limitation of keeping everything in memory doesn't limit you to a total number of actors, but a total number per replica. So you must first consider the partitioning strategy so that your actors are distributed across a number of different replicas. As your demands grow you can then add more machines to the cluster and the ServiceFabric will orchestrate movements of the replicas to the new machines. For more information on partitioning of the Actor service, see http://azure.microsoft.com/en-gb/documentation/articles/service-fabric-reliable-actors-platform/

If you do want to use cold storage after some time, then you have a couple of options. Firstly, you could decorate your actors with a custom ActorStateProviderAttribute that returns your own implementation of an IActorStateProvider that can handle persistence as you decide.

Alternatively, you could handle it entirely within your Actor implementation. Hook into the Actor Lifecycle and in OnDeactivateAsync such that when the instance is garbage collected, or use an Actor Reminder for some specified time in the future, to serialise the state and store in cold storage such as blob or table storage and null out the State property. The ActivateAsync override can then be used to retrieve this state from offline storage and deserialise.