We have a Service Fabric cluster with a single (primary) scale set of 5 nodes. A memory leak in one of our services consumed all of the available memory on the nodes, and eventually other services failed. For instance, some PowerShell commands no longer work. Service Fabric Explorer nevertheless shows everything as healthy, with no errors or warnings. Is it possible to restart the machines, and what is the best way to do it so that they return to their initial state with all of the services working?
When a scale set is scaled down, it removes the node with the highest index. The faulty nodes are the original, lower-indexed ones, so following the documentation (scale up, then remove the faulty nodes) won't help.
What would happen if we restarted the scale set nodes one by one? I see that Service Fabric handles this: it disables the node and activates it again afterwards. But according to the documentation, the Silver tier requires 5 nodes to be up and running at all times. So before restarting any of the nodes, should we scale up by adding one more node and only then proceed with the restarts?
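To make the last question concrete, here is a minimal sketch of the sequence I have in mind. It is a pure in-memory simulation and does not call any Service Fabric or Azure API. The node names, the `Cluster` class, and the `rolling_restart` helper are all hypothetical, and the 5-node minimum is simply the figure quoted above.

```python
from dataclasses import dataclass, field

# Minimum number of nodes that must stay up, as quoted in the question
# for the Silver tier (assumption: taken from the question, not verified here).
MIN_NODES_UP = 5


@dataclass
class Cluster:
    """Toy model of a cluster: node name -> "Up" or "Disabled"."""
    nodes: dict = field(default_factory=dict)

    def up_count(self) -> int:
        return sum(1 for state in self.nodes.values() if state == "Up")

    def add_node(self, name: str) -> None:
        # Stands in for scaling the scale set up by one instance.
        self.nodes[name] = "Up"

    def restart(self, name: str) -> None:
        # Refuse the restart if taking this node down would drop the
        # cluster below the required minimum of running nodes.
        if self.up_count() - 1 < MIN_NODES_UP:
            raise RuntimeError(f"restarting {name} would leave too few nodes up")
        self.nodes[name] = "Disabled"  # stands in for disabling the node
        # ... the actual machine reboot would happen here ...
        self.nodes[name] = "Up"        # stands in for re-activating the node


def rolling_restart(cluster: Cluster, targets: list, spare_name: str) -> list:
    """Add one spare node, then restart the targets strictly one at a time."""
    cluster.add_node(spare_name)
    restarted = []
    for name in targets:
        cluster.restart(name)
        restarted.append(name)
    return restarted


if __name__ == "__main__":
    names = [f"node_{i}" for i in range(5)]  # hypothetical node names
    cluster = Cluster({n: "Up" for n in names})
    print(rolling_restart(cluster, names, spare_name="node_5"))
    print(cluster.up_count())
```

In this model a restart with exactly 5 nodes is rejected, while adding the spare first lets every original node be restarted in turn. If the spare ends up with the highest index, scaling back down afterwards would remove it rather than one of the original nodes.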