0
votes

Can you please give me a better understanding of how we can scale the stateless services without partitioning?

Say we have 5 nodes in a cluster and we have 5 instances of the service. On simple testing a node is behaving as sticky where all the requests I am sending are being served by only one node. In the scenario when we have high volume of requests that come in, can other instances be automatically used to serve the traffic. How do we handle such scale out situations in service fabric?

Thanks!

3

3 Answers

1
votes

Usually there's no need to use partitioning for stateless SF services, so avoid that if you can:

more on SF partitioning, including why its not normally used for stateless services

If you're using the ServiceProxy API, it will maintain sticky connections to a given physical node in the cluster. If you're (say) exposing HTTP endpoints, you'll have one for each physical instance in the cluster (meaning you'll end up talking to one at a time, unless you manually cycle thru them). You can avoid this by:

  1. Creating a new proxy instance for each call, which tends to be expensive if you do it alot (or manually cycle thru the list of instance endpoint URLs, which can be tedious and/or expensive)

  2. Put a load balancer in front of your cluster and configure all traffic from your clients to SF nodes to be forwarded thru that. The load balancer can be configured for Round-Robin, etc. style semantics:

Azure Load Balancer

Azure Traffic Manager

Good luck!

1
votes

You can query the request using the reverse proxy installed on each node. Using the https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-reverseproxy

The reverse proxy then resolve the endpoint for you. If you have multiple instances of the a stateless service then it will forward your request to a random one.

If during heavy load you can increase the instance count of your service and the proxy then include the new instances automatically.

1
votes

I will assume you are calling your services from outside your cluster. If yes, your problem is not specific for Service Fabric, it is Azure VMSS + LB.

Service Fabric runs on top of Virtual Machines Scale Set, these VMs are created behind a Load Balancer, when the client connects to your service, they are creating a connection through the load balancer to your service, whenever a connection is open, the load balancer assign one target VM for handling your request, and any request made from your client, while using the same connection(keep alive), will be handled by the same node, this is why your load goes to a single node.

LB won't round robin the requests because they are using the same connection, it is a limitation(feature) of the LB, to work around this problem, you should open multiple connections or use multiple clients(instances).

This is for default distribution mode(Hash-based). You have to check also the routing rules in the LB to check if the distribution mode is Hash-based(5 tuple= ip+port) or if it is IP affinity mode(ip only), otherwise multiple connections from same IP will still be linked to same node.

Source: Azure Load Balaner Distribution Mode