I have a test actor system with customer actors. It keeps track of state of a customer. Event messages and query messages are sent to the Coordinator. If no Customer actor exists for the message, the Coordinator creates one. Works fine and looks like this:
Now I want to add clustering. I want the system to be large enough for many customers and when one goes down, the messages can still be handled be the other nodes. So after reading and tinkering a bit, I thought I knew what to do, but I must be misunderstanding. This was my approach. I added a ConsistentHashingPool actor above the coordinator (using customer ID as the key). Then by making that pool cluster aware, it could distribute coordinators with their customers to other nodes. When still on one node it looks like this:
Worked like a charm. But then I added a second node to the cluster. It then looked like this:
This is not what I want. All customers have now two actors representing them. Some of the events end up on node 1, others on node 2. I clearly had some wrong expectations. I sort-of expected the cluster-aware pool to be present on both node "as one".
So what should I do to achieve my goal? Maybe there should be dispatcher roles and state roles? But I'd still want 2 dispatchers then. Should I look into this Cluster Singleton thing? There it states that the singleton can easily become a bottleneck. Preferably, I would have a router on every node, but they would share the routees, some local and some remote.