0
votes

My goal is to upgrade the OS of my Service Fabric VM scale sets (VMSS) from Windows Server 2016 to Windows Server 2019.

I followed the Microsoft document Scale up a Service Fabric cluster primary node type:

  • Deployed the initial cluster with two node types and two scale sets (one scale set per node type) using these sample template and parameter files. Both scale sets are size Standard D2_V2 and run Windows Server 2012 R2 Datacenter.
  • Deployed a new scale set to the primary node type using these sample template and parameter files. The new scale set VMs are size Standard D4_V2 and run Windows Server 2016 Datacenter with Containers.

I am facing the following issue:

  1. According to the document, the new scale set should become part of the Service Fabric cluster, but it does not show up in Service Fabric Explorer.

Once the new VMSS is part of the Service Fabric cluster, I will disable the nodes of the Windows Server 2012 scale set.

Any ideas? Or is there an alternative way to upgrade the VMSS OS from Windows Server 2016 to Windows Server 2019?

1
Impossible to tell what you did wrong without seeing your templates. - 4c74356b41
I didn't modify anything in the template; I only changed the certificate-related values in the parameters file. The rest remains as is. According to the document, after deploying the Windows 2016 VMSS, the new VMSS should be part of the current Service Fabric cluster, but it doesn't show up. - Rajakumar Babu
Did you see this answer? - LoekD
I tried Update-AzVmss -ResourceGroupName "RG Name" -VMScaleSetName "VMSS NAME" -ImageReferenceSku "2016-Datacenter-with-Containers" -ImageReferenceVersion "Latest". The seed node hung in the "Disabling" state. I waited for a long time, with no change. - Rajakumar Babu

1 Answer

1
votes

Microsoft reference: Scale up a Service Fabric cluster primary node type

Here are my findings on the question above. I have successfully upgraded the OS of my Service Fabric cluster's VMSS from Windows Server 2016 to Windows Server 2019.

-In the ARM template, the newly created VMSS is not part of the Service Fabric cluster. I made the following change under nodeTypes:

"managementEndpoint": "[concat('https://',reference(concat(parameters('lbIPName'),'-','0')).dnsSettings.fqdn,':',parameters('nt0fabricHttpGatewayPort'))]",
            "nodeTypes": [
                {
                    "name": "[parameters('vmNodeType2Name')]",
                    "applicationPorts": {
                    *
                    *
                },

When you deploy the ARM template with the above changes, the newly created VMSS becomes part of the existing Service Fabric cluster.
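
For reference, the deployment itself can be done from PowerShell. This is only a minimal sketch: the file names template.json and parameters.json and the helper Invoke-ClusterTemplateDeployment are placeholders I made up, and it assumes the Az module is installed and you are already signed in with Connect-AzAccount. The mode is set to Incremental explicitly so that existing resources in the resource group are left in place.

```powershell
# Placeholder values (assumptions): replace with your resource group and file paths.
$deployArgs = @{
    ResourceGroupName     = "RG-NAME"
    TemplateFile          = ".\template.json"
    TemplateParameterFile = ".\parameters.json"
    Mode                  = "Incremental"   # do not delete resources missing from the template
}

# Hypothetical helper: wraps the call so that nothing is deployed until you invoke it.
function Invoke-ClusterTemplateDeployment {
    param([hashtable]$Arguments)
    New-AzResourceGroupDeployment @Arguments -Verbose
}
```

Run it with Invoke-ClusterTemplateDeployment -Arguments $deployArgs once the nodeTypes change is in the template.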

-Connect to the Service Fabric cluster using the following command:

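# Cluster client connection endpoint (port 19000) and certificate thumbprint
$clusterName = "Cluster-URL:19000"
$thumb = "xxxxxxxxxxx"
# The same thumbprint is used to verify the server and to find the client certificate
Connect-ServiceFabricCluster `
    -ConnectionEndpoint $clusterName `
    -KeepAliveIntervalInSec 10 `
    -X509Credential `
    -ServerCertThumbprint $thumb `
    -FindType FindByThumbprint `
    -FindValue $thumb `
    -StoreLocation CurrentUser `
    -StoreName My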
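
Once connected, it is worth confirming that the nodes of the new scale set really joined before touching the old ones. The sketch below groups the output of Get-ServiceFabricNode by node type and also counts seed nodes per type. Get-NodeTypeSummary is a helper name I made up; it assumes the connection above succeeded.

```powershell
# Hypothetical helper: summarizes cluster nodes per node type.
# By default it queries the connected cluster; a node list can also be passed in.
function Get-NodeTypeSummary {
    param([object[]]$Nodes = (Get-ServiceFabricNode))
    $Nodes | Group-Object NodeType | ForEach-Object {
        [pscustomobject]@{
            NodeType  = $_.Name
            Count     = $_.Count
            # Number of nodes of this type that are currently seed nodes
            SeedNodes = @($_.Group | Where-Object { $_.IsSeedNode }).Count
            # Distinct node statuses seen for this type, e.g. "Up" or "Disabling,Up"
            Statuses  = (($_.Group | ForEach-Object { "$($_.NodeStatus)" } | Sort-Object -Unique) -join ',')
        }
    }
}
```

Calling Get-NodeTypeSummary | Format-Table should list both the old and the new node type if the new VMSS is part of the cluster.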

-Disable the Service Fabric cluster nodes that need to be deleted (i.e. the nodes of the 2016 VMSS):

# Node names of the old scale set
$nodeNames = @("_NTvm1_0","_NTvm1_1","_NTvm1_2","_NTvm1_3","_NTvm1_4")
Write-Host "Disabling nodes..."
foreach ($name in $nodeNames) {
    Disable-ServiceFabricNode -NodeName $name -Intent RemoveNode -Force
}

After the above command runs successfully, the nodes first show the Disabling status and, after some time, the Disabled status. You can monitor this in Service Fabric Explorer.
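
If you would rather wait from the script than watch Service Fabric Explorer, something like the following polling loop should work. It is a sketch under assumptions: Wait-NodesDisabled is a helper name I made up, the timeout and polling interval are arbitrary, and it relies on the NodeStatus property returned by Get-ServiceFabricNode.

```powershell
# Hypothetical helper: returns $true once every listed node reports "Disabled",
# or $false if the timeout expires first. Requires an active cluster connection.
function Wait-NodesDisabled {
    param(
        [string[]]$NodeNames,
        [int]$TimeoutSec = 3600,   # arbitrary default (assumption)
        [int]$PollSec = 30,        # arbitrary default (assumption)
        # Status lookup is injectable so the loop can be tried without a cluster.
        [scriptblock]$GetStatus = { param($n) (Get-ServiceFabricNode -NodeName $n).NodeStatus }
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSec)
    while ($true) {
        # Collect the nodes that have not reached the Disabled status yet
        $pending = @($NodeNames | Where-Object { "$(& $GetStatus $_)" -ne 'Disabled' })
        if ($pending.Count -eq 0) { return $true }
        if ((Get-Date) -ge $deadline) {
            Write-Warning "Still not disabled: $($pending -join ', ')"
            return $false
        }
        Start-Sleep -Seconds $PollSec
    }
}
```

Only continue with removing the scale set when Wait-NodesDisabled -NodeNames $nodeNames returns $true.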

-The next step is to remove the VMSS whose nodes were disabled in the previous step:

$scaleSetName = "NTvm1"
$resourceGroupName = "RG-NAME"
Remove-AzVmss `
-ResourceGroupName $resourceGroupName `
-VMScaleSetName $scaleSetName `
-Force
Write-Host "Removed scale set $scaleSetName"

-At this point Service Fabric Explorer fails with a "page not found" error. Don't panic. The load balancer settings need to be switched over to the newly created VMSS:

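$groupname="RG-NAME"
# $lbname must be the load balancer of the old (removed) scale set, not the new one
$lbname="Old LB Name"
$oldPublicIpName="Old LB PublicIP"
$newPublicIpName="New LB PublicIP"

# Save the DNS label and FQDN of the old public IP before deleting it
$oldprimaryPublicIP = Get-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname
$primaryDNSName = $oldprimaryPublicIP.DnsSettings.DomainNameLabel
$primaryDNSFqdn = $oldprimaryPublicIP.DnsSettings.Fqdn
# Remove the old load balancer first, then the public IP that was attached to it
Remove-AzLoadBalancer -Name $lbname -ResourceGroupName $groupname -Force
Remove-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname -Force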

-Update the DNS settings of the new public IP address so that it takes over the DNS name previously used by the old primary node type:

$PublicIP = Get-AzPublicIpAddress -Name $newPublicIpName  -ResourceGroupName $groupname
$PublicIP.DnsSettings.DomainNameLabel = $primaryDNSName
$PublicIP.DnsSettings.Fqdn = $primaryDNSFqdn
Set-AzPublicIpAddress -PublicIpAddress $PublicIP

Once this is done we are good to go

-Check the Service Fabric cluster health using the Get-ServiceFabricClusterHealth command.
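
A rough way to script that check is to look at the aggregated health state and list any nodes that are not Ok. Get-UnhealthyNodeNames is a helper name I made up; it assumes the object returned by Get-ServiceFabricClusterHealth exposes NodeHealthStates entries with NodeName and AggregatedHealthState.

```powershell
# Hypothetical helper: returns the names of nodes whose aggregated health is not "Ok".
# By default it queries the connected cluster; a health object can also be passed in.
function Get-UnhealthyNodeNames {
    param($Health = (Get-ServiceFabricClusterHealth))
    @($Health.NodeHealthStates |
        Where-Object { "$($_.AggregatedHealthState)" -ne 'Ok' } |
        ForEach-Object { $_.NodeName })
}
```

An empty result from Get-UnhealthyNodeNames, together with an overall AggregatedHealthState of Ok, suggests the cluster settled after the switch.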

NOTE: Make sure your cluster's reliability level is set to "Silver". Microsoft recommends this for production environments.