When I create an AKS cluster from an ARM template with the identity type set to SystemAssigned, a secondary resource group named MC_<rg_name>_<cluster_name>_<location> is created. Inside this group is a managed identity named <cluster-name>-agentpool, which the kubelet uses.

The full ARM template is at the bottom of the post, but the general structure is as follows. I split it into separate deployments because I deploy at the subscription level:

Deployment A
  - Microsoft.ContainerService/managedClusters
Deployment B (dependsOn A)
  - Microsoft.Authorization/roleAssignments
    - contains a reference() into the `nodeResourceGroup` with an explicit API version, so I added an explicit dependsOn on A

I plan on using Azure Pod Identity, so I need to assign that managed identity two roles: Managed Identity Operator and Virtual Machine Contributor. I have a variable that builds the resource ID of this identity, which is then passed to reference():

"agentpool-account": "[concat(subscription().id, '/resourceGroups/', variables('managedClusterResourceGroup'), '/providers/Microsoft.ManagedIdentity/userAssignedIdentities/', parameters('cluster-name'), '-agentpool')]",
"principalId": "[reference(variables('agentpool-account'), '2018-11-30', 'full').properties.principalId]",

However, the first time I deploy this template (i.e. on cluster creation), the role assignment deployment fails with a Resource Group not found error. The deployment activity confirms that the role assignment is deployed before the MC_ resource group is created, despite the explicit dependsOn on the cluster deployment. If I redeploy the template, it succeeds because the MC_ resource group now exists as far as ARM is concerned.

Has anyone else run into this issue? Any tips on how to resolve it would be great. I came across https://bmoore-msft.blog/2020/07/26/resource-not-found-dependson-is-not-working/ but it doesn't seem to work for me.

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.1",
  "parameters": {
    "cluster-name": {
      "metadata": {
        "description": "The name of the cluster"
      },
      "type": "string"
    }
  },
  "resources": [
    {
      "apiVersion": "2019-10-01",
      "location": "centralus",
      "name": "test",
      "type": "Microsoft.Resources/resourceGroups"
    },
    {
      "apiVersion": "2020-06-01",
      "dependsOn": [
        "[resourceId('Microsoft.Resources/resourceGroups', 'test')]"
      ],
      "name": "cluster-deployment",
      "properties": {
        "expressionEvaluationOptions": {
          "scope": "outer"
        },
        "mode": "Incremental",
        "parameters": {},
        "template": {
          "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
          "contentVersion": "1.0.0.0",
          "outputs": {},
          "parameters": {},
          "resources": [
            {
              "apiVersion": "2019-06-01",
              "dependsOn": [],
              "identity": {
                "type": "SystemAssigned"
              },
              "location": "centralus",
              "name": "[parameters('cluster-name')]",
              "properties": {
                "addonProfiles": {
                  "azurePolicy": {
                    "enabled": false
                  },
                  "httpApplicationRouting": {
                    "enabled": false
                  }
                },
                "agentPoolProfiles": [
                  {
                    "availabilityZones": [
                      "1",
                      "2",
                      "3"
                    ],
                    "count": 3,
                    "maxPods": 110,
                    "mode": "System",
                    "name": "agentpool",
                    "osDiskSizeGB": 0,
                    "osType": "Linux",
                    "storageProfile": "ManagedDisks",
                    "type": "VirtualMachineScaleSets",
                    "vmSize": "Standard_D16s_v3"
                  }
                ],
                "apiServerAccessProfile": {
                  "enablePrivateCluster": false
                },
                "dnsPrefix": "[concat(parameters('cluster-name'), '-dns')]",
                "enableRBAC": true,
                "kubernetesVersion": "1.17.11",
                "networkProfile": {
                  "loadBalancerSku": "standard",
                  "networkPlugin": "kubenet",
                  "networkPolicy": "calico"
                }
              },
              "tags": {},
              "type": "Microsoft.ContainerService/managedClusters"
            }
          ],
          "variables": {}
        }
      },
      "resourceGroup": "test",
      "type": "Microsoft.Resources/deployments"
    },
    {
      "apiVersion": "2020-06-01",
      "dependsOn": [
        "cluster-deployment"
      ],
      "name": "identity-assignment",
      "properties": {
        "expressionEvaluationOptions": {
          "scope": "outer"
        },
        "mode": "Incremental",
        "template": {
          "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
          "contentVersion": "1.0.0.0",
          "outputs": {},
          "parameters": {},
          "resources": [
            {
              "apiVersion": "2017-09-01",
              "name": "[guid('test', 'ManagedIdentityOperator')]",
              "properties": {
                "principalId": "[reference(variables('agentpoolResourceId'), '2018-11-30', 'full').properties.principalId]",
                "roleDefinitionId": "[variables('managedIdentityOperatorRoleId')]",
                "scope": "[concat(subscription().id, '/resourceGroups/test')]"
              },
              "type": "Microsoft.Authorization/roleAssignments"
            },
            {
              "apiVersion": "2017-09-01",
              "name": "[guid('test', 'VirtualMachineContributor')]",
              "properties": {
                "principalId": "[reference(variables('agentpoolResourceId'), '2018-11-30', 'full').properties.principalId]",
                "roleDefinitionId": "[variables('virtualMachineContributorRoleId')]",
                "scope": "[concat(subscription().id, '/resourceGroups/test')]"
              },
              "type": "Microsoft.Authorization/roleAssignments"
            }
          ]
        }
      },
      "resourceGroup": "test",
      "type": "Microsoft.Resources/deployments"
    }
  ],
  "variables": {
    "agentPoolResourceId": "[concat(subscription().id, '/resourceGroups/', variables('managedClusterResourceGroup'), '/providers/Microsoft.ManagedIdentity/userAssignedIdentities/', parameters('cluster-name'), '-agentpool')]",
    "managedClusterResourceGroup": "[concat('MC_test_', parameters('cluster-name'), '_centralus')]",
    "managedIdentityOperatorRoleId": "[concat(subscription().id, '/providers/Microsoft.Authorization/roleDefinitions/f1a07417-d97a-45cb-824c-7a7467783830')]",
    "virtualMachineContributorRoleId": "[concat(subscription().id, '/providers/Microsoft.Authorization/roleDefinitions/9980e02c-c2be-4d73-94e8-173b1dc7cf3c')]"
  }
}

1 Answer

Try the template below. I suspect the problem you ran into is that the system-assigned identity (aka MSI) has not finished replicating globally by the time the roleAssignment is created. To fix that, you can add the principalType property to the roleAssignment, which forces the assignment even though the principal (i.e. the MSI) may not be found yet. That's probably the simplest fix.
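
In isolation, the principalType fix might look like the minimal sketch below. Two things here are assumptions rather than part of the original templates: the principalId parameter is a hypothetical placeholder for whichever identity receives the role, and the apiVersion is chosen because Azure's guidance on assigning roles through ARM templates says principalType needs 2018-09-01-preview or later. The fragment is meant for a resource-group-scoped deployment, and it relies on ARM templates accepting // comments.

```json
{
  // Hypothetical standalone role assignment; not copied from either template in this thread.
  "type": "Microsoft.Authorization/roleAssignments",
  // Assumption: an API version at or above 2018-09-01-preview, so that principalType is honored.
  "apiVersion": "2020-04-01-preview",
  "name": "[guid(resourceGroup().id, parameters('principalId'), 'ManagedIdentityOperator')]",
  "properties": {
    // 'principalId' is a hypothetical parameter holding the identity's object ID.
    "principalId": "[parameters('principalId')]",
    "roleDefinitionId": "[subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'f1a07417-d97a-45cb-824c-7a7467783830')]",
    // Lets the assignment proceed even if the new identity has not replicated yet.
    "principalType": "ServicePrincipal"
  }
}
```

Seeding guid() with the scope, the principal, and the role keeps the assignment name stable across redeployments while avoiding collisions between different principals.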

Aside from that, I changed the template a bit to remove the second deployment, so the diff may look bigger than what you actually need. It's just a different way to approach it.

{
    "$schema": "https://schema.management.azure.com/schemas/2018-05-01/subscriptionDeploymentTemplate.json#",
    "contentVersion": "1.0.0.1",
    "parameters": {
        "cluster-name": {
            "type": "string",
            "defaultValue": "mc1"
        },
        "resourceGroupName": {
            "type": "string",
            "defaultValue": "test"
        }
    },
    "resources": [
        {
            "type": "Microsoft.Resources/resourceGroups",
            "apiVersion": "2019-10-01",
            "location": "centralus",
            "name": "[parameters('resourceGroupName')]"
        },
        {
            "type": "Microsoft.Resources/deployments",
            "apiVersion": "2020-06-01",
            "name": "cluster-deployment",
            "resourceGroup": "[parameters('resourceGroupName')]",
            "dependsOn": [
                "[subscriptionResourceId('Microsoft.Resources/resourceGroups', parameters('resourceGroupName'))]"
            ],
            "properties": {
                "expressionEvaluationOptions": {
                    "scope": "inner"
                },
                "mode": "Incremental",
                "parameters": {
                    "cluster-name":{
                        "value": "[parameters('cluster-name')]"
                    }
                },
                "template": {
                    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
                    "contentVersion": "1.0.0.0",
                    "parameters": {
                        "cluster-name": {
                            "type": "string"
                        }
                    },
                    "variables": {
                        "managedIdentityOperatorRoleId": "[subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'f1a07417-d97a-45cb-824c-7a7467783830')]",
                        "virtualMachineContributorRoleId": "[subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '9980e02c-c2be-4d73-94e8-173b1dc7cf3c')]"
                    },
                    "resources": [
                        {
                            "type": "Microsoft.ContainerService/managedClusters",
                            "apiVersion": "2020-11-01",
                            "name": "[parameters('cluster-name')]",
                            "location": "centralus",
                            "identity": {
                                "type": "SystemAssigned"
                            },
                            "properties": {
                                "addonProfiles": {
                                    "azurePolicy": {
                                        "enabled": false
                                    },
                                    "httpApplicationRouting": {
                                        "enabled": false
                                    }
                                },
                                "agentPoolProfiles": [
                                    {
                                        "availabilityZones": [
                                            "1",
                                            "2",
                                            "3"
                                        ],
                                        "count": 3,
                                        "maxPods": 110,
                                        "mode": "System",
                                        "name": "agentpool",
                                        "osDiskSizeGB": 0,
                                        "osType": "Linux",
                                        "storageProfile": "ManagedDisks",
                                        "type": "VirtualMachineScaleSets",
                                        "vmSize": "Standard_D16s_v3"
                                    }
                                ],
                                "apiServerAccessProfile": {
                                    "enablePrivateCluster": false
                                },
                                "dnsPrefix": "[concat(parameters('cluster-name'), '-dns')]",
                                "enableRBAC": true,
                                "kubernetesVersion": "1.17.11",
                                "networkProfile": {
                                    "loadBalancerSku": "standard",
                                    "networkPlugin": "kubenet",
                                    "networkPolicy": "calico"
                                }
                            }
                        },
                        {
                            "type": "Microsoft.Authorization/roleAssignments",
                            "apiVersion": "2017-09-01",
                            "name": "[guid('test', 'ManagedIdentityOperator')]",
                            "properties": {
                                "principalId": "[reference(parameters('cluster-name'), '2020-11-01', 'full').identity.principalId]",
                                "roleDefinitionId": "[variables('managedIdentityOperatorRoleId')]",
                                "scope": "[resourceGroup().id]",
                                "principalType": "ServicePrincipal"
                            }
                        },
                        {
                            "type": "Microsoft.Authorization/roleAssignments",
                            "apiVersion": "2017-09-01",
                            "name": "[guid('test', 'VirtualMachineContributor')]",
                            "properties": {
                                "principalId": "[reference(parameters('cluster-name'), '2020-11-01', 'full').identity.principalId]",
                                "roleDefinitionId": "[variables('virtualMachineContributorRoleId')]",
                                "scope": "[resourceGroup().id]",
                                "principalType": "ServicePrincipal"
                            }
                        }
                    ]
                }
            }
        }
    ]
}
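
One thing worth double-checking: the template above assigns the roles to the cluster's own system-assigned identity (identity.principalId), whereas the question concerns the <cluster-name>-agentpool identity that the kubelet uses. A hedged sketch of targeting the kubelet identity instead follows. It assumes the managedClusters resource exposes that identity at properties.identityProfile.kubeletidentity.objectId for API version 2020-11-01, which should be verified against the resource reference before relying on it. Everything except the principalId expression and the explicit dependsOn mirrors the role assignment already in the template.

```json
{
  "type": "Microsoft.Authorization/roleAssignments",
  "apiVersion": "2017-09-01",
  "name": "[guid('test', 'ManagedIdentityOperator')]",
  // Explicit dependency, because reference() below uses a resource ID rather than a bare name.
  "dependsOn": [
    "[resourceId('Microsoft.ContainerService/managedClusters', parameters('cluster-name'))]"
  ],
  "properties": {
    // Assumption: identityProfile.kubeletidentity.objectId is populated once the cluster is created.
    "principalId": "[reference(resourceId('Microsoft.ContainerService/managedClusters', parameters('cluster-name')), '2020-11-01').identityProfile.kubeletidentity.objectId]",
    "roleDefinitionId": "[variables('managedIdentityOperatorRoleId')]",
    "scope": "[resourceGroup().id]",
    "principalType": "ServicePrincipal"
  }
}
```

Reading the object ID from the cluster resource avoids building a path into the MC_ resource group by hand, so nothing has to resolve that group before the cluster exists.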

Let me know if you have any questions about the other changes.