Microsoft reference link on Scale up a Service Fabric cluster primary node type
My findings on the above query: I have successfully upgraded the Service Fabric cluster VMSS OS from 2016 to 2019.
-A newly created VMSS defined in the ARM template is not part of the Service Fabric cluster by default. To join it to the cluster, I made the following changes under nodeTypes
"managementEndpoint": "[concat('https://',reference(concat(parameters('lbIPName'),'-','0')).dnsSettings.fqdn,':',parameters('nt0fabricHttpGatewayPort'))]",
"nodeTypes": [
{
"name": "[parameters('vmNodeType2Name')]",
"applicationPorts": {
*
*
},
When you deploy the ARM template with the above changes, the newly created VMSS becomes part of the existing Service Fabric cluster.
-Connect to the Service Fabric cluster using the following command
$clusterName = "Cluster-URL:19000"
$thumb = "xxxxxxxxxxx"
Connect-ServiceFabricCluster `
-ConnectionEndpoint $clusterName `
-KeepAliveIntervalInSec 10 `
-X509Credential `
-ServerCertThumbprint $thumb `
-FindType FindByThumbprint `
-FindValue $thumb `
-StoreLocation CurrentUser `
-StoreName My
-Disable the Service Fabric cluster nodes that need to be deleted (i.e. the 2016 VMSS)
$nodeNames = @("_NTvm1_0","_NTvm1_1","_NTvm1_2","_NTvm1_3","_NTvm1_4")
Write-Host "Disabling nodes..."
foreach($name in $nodeNames){
Disable-ServiceFabricNode -NodeName $name -Intent RemoveNode -Force
}
After the above command runs successfully, the nodes first show a Disabling status and, after some time, a Disabled status. This can be monitored in Service Fabric Explorer.
-The next step is to remove the VMSS whose nodes were disabled in the previous step
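If you prefer to wait from the console instead of watching Service Fabric Explorer, the check can be scripted. This is only a minimal sketch: it assumes the Connect-ServiceFabricCluster session from the earlier step is still active and reuses the $nodeNames array defined above. The 30-second polling interval is an arbitrary choice.

```powershell
# Sketch: poll each node until it reports Disabled.
# Assumes an active cluster connection and the $nodeNames array from the previous step.
foreach ($name in $nodeNames) {
    do {
        # NodeStatus moves from Disabling to Disabled once the node is drained
        $status = (Get-ServiceFabricNode -NodeName $name).NodeStatus
        Write-Host "$name : $status"
        if ($status -ne 'Disabled') {
            Start-Sleep -Seconds 30   # arbitrary interval, adjust as needed
        }
    } while ($status -ne 'Disabled')
}
Write-Host "All nodes are disabled"
```

The loop has no timeout, so stop it manually if a node stays in Disabling for longer than you expect.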
$scaleSetName = "NTvm1"
$resourceGroupName = "RG-NAME"
Remove-AzVmss `
-ResourceGroupName $resourceGroupName `
-VMScaleSetName $scaleSetName `
-Force
Write-Host "Removed scale set $scaleSetName"
-At this point Service Fabric Explorer fails with a page not found error. Don't panic. The load balancer settings need to be switched over to the newly created VMSS
# Name of the load balancer that fronted the old (removed) scale set; it is deleted below
$lbname="Old LB Name"
$oldPublicIpName="Old LB PublicIP"
$newPublicIpName="New LB PublicIP"
# Save the DNS settings of the old public IP so they can be moved to the new one
$oldprimaryPublicIP = Get-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $resourceGroupName
$primaryDNSName = $oldprimaryPublicIP.DnsSettings.DomainNameLabel
$primaryDNSFqdn = $oldprimaryPublicIP.DnsSettings.Fqdn
# Remove the old load balancer and its public IP
Remove-AzLoadBalancer -Name $lbname -ResourceGroupName $resourceGroupName -Force
Remove-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $resourceGroupName -Force
-Update the DNS settings of the new public IP address with the values taken from the public IP address of the old primary node type
$PublicIP = Get-AzPublicIpAddress -Name $newPublicIpName -ResourceGroupName $resourceGroupName
$PublicIP.DnsSettings.DomainNameLabel = $primaryDNSName
$PublicIP.DnsSettings.Fqdn = $primaryDNSFqdn
Set-AzPublicIpAddress -PublicIpAddress $PublicIP
Once this is done, we are good to go.
-Check the Service Fabric cluster health status using the Get-ServiceFabricClusterHealth command
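The health check could look something like the sketch below. It assumes you have reconnected with Connect-ServiceFabricCluster after the DNS change, and it only reads the aggregated state and the per-node states from the returned object. It is an illustration, not a complete validation of the cluster.

```powershell
# Sketch: print the overall cluster health and list any node that is not Ok.
# Assumes an active Connect-ServiceFabricCluster session.
$health = Get-ServiceFabricClusterHealth
Write-Host "Cluster health: $($health.AggregatedHealthState)"

# Report nodes whose aggregated health state is Warning or Error
$health.NodeHealthStates |
    Where-Object { $_.AggregatedHealthState -ne 'Ok' } |
    ForEach-Object { Write-Host "$($_.NodeName) : $($_.AggregatedHealthState)" }
```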
NOTE
Make sure your cluster reliability level is set to "Silver". Microsoft recommends this for production environments.