
I have a customer with a Service Fabric cluster of five node types, which is replicated into a DR facility.

They would like to keep the cluster offline (de-allocated) and start it only when they invoke their DR procedure.

Only a single node type hosts stateful services, and the Service Fabric system services run in their own node type.

With the exception of the System Node Type, the shutdown script needs to do the following:

  1. Disable Service Fabric Nodes with Restart Intent
  2. Stop the Virtual Machine Scale Set

and do this in the following order:

  1. Stateless Service Node Types (These feed the stateful services)
  2. Stateful Service Node Types
  3. System Node Type

The startup script should do the reverse, so:

  1. System Node Type
  2. Stateful Service Node Types
  3. Stateless Node Types

After each scale set is started, the nodes of every node type except the System Node Type are re-enabled with

Enable-ServiceFabricNode

Can anyone see any problems or dangers with this approach?
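
The shutdown and startup sequences described above could be sketched roughly as follows. This is only an illustration, not a tested DR script. It assumes the Az.Compute module and the Service Fabric PowerShell module are available, that `Connect-AzAccount` and `Connect-ServiceFabricCluster` have already been run, and that each scale set carries the same name as its node type. The function names `Get-StartupOrder`, `Stop-DrCluster` and `Start-DrCluster` are hypothetical.

```powershell
# Sketch only: function names are hypothetical, and each scale set is
# ASSUMED to be named after its Service Fabric node type.

function Get-StartupOrder {
    param([string[]]$ShutdownOrder)
    # Startup is the exact reverse of the shutdown order.
    $order = $ShutdownOrder.Clone()
    [array]::Reverse($order)
    return $order
}

function Stop-DrCluster {
    param(
        [string]$ResourceGroup,
        [string[]]$ShutdownOrder,   # e.g. stateless types, then stateful, then system
        [string]$SystemNodeType
    )
    foreach ($nodeType in $ShutdownOrder) {
        if ($nodeType -ne $SystemNodeType) {
            # Disable every node of this type with Restart intent.
            # Note: this sketch does not wait for deactivation to complete.
            Get-ServiceFabricNode |
                Where-Object { $_.NodeType -eq $nodeType } |
                ForEach-Object {
                    Disable-ServiceFabricNode -NodeName $_.NodeName -Intent Restart -Force
                }
        }
        # Stop-AzVmss deallocates the instances unless -StayProvisioned is given.
        Stop-AzVmss -ResourceGroupName $ResourceGroup -VMScaleSetName $nodeType -Force
    }
}

function Start-DrCluster {
    param(
        [string]$ResourceGroup,
        [string[]]$ShutdownOrder,
        [string]$SystemNodeType
    )
    foreach ($nodeType in (Get-StartupOrder -ShutdownOrder $ShutdownOrder)) {
        Start-AzVmss -ResourceGroupName $ResourceGroup -VMScaleSetName $nodeType
        if ($nodeType -ne $SystemNodeType) {
            # ASSUMES the cluster is reachable again and a connection
            # (Connect-ServiceFabricCluster) is in place at this point.
            Get-ServiceFabricNode |
                Where-Object { $_.NodeType -eq $nodeType } |
                ForEach-Object { Enable-ServiceFabricNode -NodeName $_.NodeName }
        }
    }
}
```

A call might look like `Stop-DrCluster -ResourceGroup 'dr-rg' -ShutdownOrder 'Stateless1','Stateful','System' -SystemNodeType 'System'`, where the resource group and node type names are made up for the example.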


1 Answer


In general your approach is OK. However, from my experience with the Active-Passive pattern, I suggest you leave the passive cluster running on a smaller SKU instead of de-allocating it.

The most important reason is that when the DR procedure is invoked, you want the cluster to be up as soon as possible. Provisioning the cluster takes time, and there is no guarantee that it will succeed.

The second reason is that you want to verify in advance that the passive site works correctly, before the DR procedure is triggered. For example, you can check that the connection string for the passive site is correct.

Hope that helps