This question is specifically about Service Fabric, but the concept applies beyond this one cluster system, so feel free to pitch in even without SF experience.
I'm trying to understand the pros and cons of setting MinReplicaSetSize and TargetReplicaSetSize to the same value versus different values for a Stateful service.
Let's say you have 10 Partitions and 3 Replicas. This means every Partition has 3 Replicas (1 Primary and 2 Secondaries). Let's say you have some Collection distributed across these 10 Partitions. There are 2 situations to consider:
MinReplica and TargetReplica = 3
On some Partition the Primary Replica fails, and one of the Secondaries is promoted to Primary. As I understand it, since the Replica count (2) is now below MinReplica (3), writes to this Partition fail or halt until a new Secondary is spun up and caught up with the others, bringing the count back to 3. The obvious downside is the halting of writes.
MinReplica = 2 and TargetReplica = 3
On some Partition the Primary Replica fails, and one of the Secondaries is promoted to Primary. Since the Replica count (2) still satisfies MinReplica, writes to the new Primary continue and it replicates them to the 1 remaining Secondary. Meanwhile, another Secondary is spun up and must be caught up while writes continue. So what is the downside?
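To make the two situations concrete, here is a minimal Python sketch of the rule as I am assuming it works: writes are accepted only while the number of live Replicas is at least MinReplicaSetSize. That rule is my assumption and not confirmed Service Fabric behaviour, and the function names are hypothetical, not part of any SF API.

```python
def writes_allowed(live_replicas: int, min_replica_set_size: int) -> bool:
    # ASSUMPTION from this question, not confirmed Service Fabric behaviour:
    # writes proceed only while live replicas >= MinReplicaSetSize.
    return live_replicas >= min_replica_set_size


def after_primary_failure(target: int, minimum: int) -> dict:
    # One replica (the old Primary) is gone; a Secondary has been promoted,
    # and the replacement Secondary has not finished catching up yet.
    live = target - 1
    return {"live_replicas": live, "writes_allowed": writes_allowed(live, minimum)}


# Situation 1: MinReplica = TargetReplica = 3
print(after_primary_failure(target=3, minimum=3))
# Situation 2: MinReplica = 2, TargetReplica = 3
print(after_primary_failure(target=3, minimum=2))
```

Under this assumed rule, the first configuration blocks writes after a single failure and the second does not.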
Is it possible for the new Primary to fail, and for the remaining Secondary to be promoted to Primary and also fail, before the replacement Secondary has caught up, thereby losing committed data? Is this the downside? And would a MinReplica = 3 / TargetReplica = 4 configuration prevent even this?
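The failure sequence I have in mind can be sketched as simple counting, under the simplifying assumption that no replacement Replica finishes catching up between failures. The helper names are hypothetical and this is only an illustration of the scenario, not a model of how Service Fabric actually reconfigures.

```python
def up_to_date_copies(target: int, failures: int) -> int:
    # ASSUMPTION: every original replica held all committed data, and no
    # replacement has caught up before the next failure happens.
    return max(target - failures, 0)


def committed_data_lost(target: int, failures: int) -> bool:
    # Committed data is gone once no up-to-date copy survives.
    return up_to_date_copies(target, failures) == 0


# Three back-to-back failures: the original Primary, the promoted Secondary,
# and then the last Secondary after its own promotion.
print(committed_data_lost(target=3, failures=3))  # TargetReplica = 3
print(committed_data_lost(target=4, failures=3))  # TargetReplica = 4
```

With TargetReplica = 3 the third failure removes the last up-to-date copy, while with TargetReplica = 4 one copy would still survive the same sequence.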
I ask because I commonly see these two settings given the same value, and that got me thinking about it.