
I'm trying to create a pool of virtual machines built on my custom image. I've successfully created a custom image and added it to my batch account.

But when I try to create a pool based on this image from the Azure portal, I get an error:

There was an error encountered while performing the last resize on the pool. Please try resizing the pool again. Code: AllocationFailed

Message: Desired number of dedicated nodes could not be allocated

Details: Reason - The source managed disk or snapshot associated with the virtual machine Image Id was not found.

When creating the pool in the portal I select the image by name, since there is no option to enter an image ID. The image ID in the JSON is correct, though, and I can see the image listed in the portal under the correct Batch account.

Here's my pool properties JSON:

{
  "id": "my-pool-0",
  "displayName": "my-pool-0",
  "lastModified": "2018-12-04T15:54:06.467Z",
  "creationTime": "2018-12-04T15:44:18.197Z",
  "state": "active",
  "stateTransitionTime": "2018-12-04T15:44:18.197Z",
  "allocationState": "steady",
  "allocationStateTransitionTime": "2018-12-04T16:09:11.667Z",
  "vmSize": "standard_a2",
  "resizeTimeout": "PT15M",
  "currentDedicatedNodes": 0,
  "currentLowPriorityNodes": 0,
  "targetDedicatedNodes": 1,
  "targetLowPriorityNodes": 0,
  "enableAutoScale": false,
  "autoScaleFormula": null,
  "autoScaleEvaluationInterval": null,
  "enableInterNodeCommunication": false,
  "maxTasksPerNode": 1,
  "url": "https://mybatch.westeurope.batch.azure.com/pools/my-pool-0",
  "resizeErrors": [
    {
      "message": "Desired number of dedicated nodes could not be allocated",
      "code": "AllocationFailed",
      "values": [
        {
          "name": "Reason",
          "value": "The source managed disk or snapshot associated with the virtual machine Image Id was not found."
        }
      ]
    }
  ],
  "virtualMachineConfiguration": {
    "imageReference": {
      "publisher": null,
      "offer": null,
      "sku": null,
      "version": null,
      "virtualMachineImageId": "/subscriptions/79b59716-301e-401a-bb8b-22edg5c1he4j/resourceGroups/resource-group-1/providers/Microsoft.Compute/images/my-image"
    },
    "nodeAgentSKUId": "batch.node.ubuntu 18.04"
  },
  "applicationLicenses": null
}
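For anyone troubleshooting a similar pool, here is a minimal sketch in Python that pulls the resize error details and the custom image ID out of a pool properties document like the one above. The function names `summarize_resize_errors` and `custom_image_id` are made up for illustration, and the JSON is a trimmed copy of the pool properties from this question.

```python
import json
from typing import List, Optional


def summarize_resize_errors(pool: dict) -> List[str]:
    """Return one readable line per entry in the pool's resizeErrors list."""
    summaries = []
    for err in pool.get("resizeErrors") or []:
        # Each error may carry name/value pairs with the underlying reason.
        details = "; ".join(
            f"{v.get('name')}: {v.get('value')}" for v in err.get("values") or []
        )
        line = f"{err.get('code')}: {err.get('message')}"
        summaries.append(f"{line} [{details}]" if details else line)
    return summaries


def custom_image_id(pool: dict) -> Optional[str]:
    """Return the custom image resource ID the pool references, if any."""
    config = pool.get("virtualMachineConfiguration") or {}
    return (config.get("imageReference") or {}).get("virtualMachineImageId")


# Trimmed copy of the pool properties shown in the question.
POOL_JSON = """
{
  "id": "my-pool-0",
  "resizeErrors": [
    {
      "message": "Desired number of dedicated nodes could not be allocated",
      "code": "AllocationFailed",
      "values": [
        {
          "name": "Reason",
          "value": "The source managed disk or snapshot associated with the virtual machine Image Id was not found."
        }
      ]
    }
  ],
  "virtualMachineConfiguration": {
    "imageReference": {
      "virtualMachineImageId": "/subscriptions/79b59716-301e-401a-bb8b-22edg5c1he4j/resourceGroups/resource-group-1/providers/Microsoft.Compute/images/my-image"
    },
    "nodeAgentSKUId": "batch.node.ubuntu 18.04"
  }
}
"""

pool = json.loads(POOL_JSON)
for line in summarize_resize_errors(pool):
    print(line)
print(custom_image_id(pool))
```

This only reformats what the service already reported. It does not contact Azure, so it cannot tell whether the image or its source disk still exists.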

The error text seems to have nothing to do with what is actually wrong. Has anyone encountered this error, or does anyone know a way to troubleshoot it?

UPDATE

Packer JSON used to create the image (taken from here):

{
  "builders": [{
    "type": "azure-arm",

    "client_id": "ffxcvbd0-c867-429a-bxcv-8ee0acvb6f99",
    "client_secret": "cvb54cvb-202d-4wq-bb8b-22cdfbce4f",
    "tenant_id": "ae33sdfd-a54c-40af-b20c-80810f0ff5da",
    "subscription_id": "096da34-4604-4bcb-85ae-2afsdf22192b",

    "managed_image_resource_group_name": "resource-group-1",
    "managed_image_name": "my-image",

    "os_type": "Linux",
    "image_publisher": "Canonical",
    "image_offer": "UbuntuServer",
    "image_sku": "18.04-LTS",

    "azure_tags": {
        "dept": "Engineering",
        "task": "Image deployment"
    },

    "location": "West Europe",
    "vm_size": "Standard_DS2_v2"
  }],
  "provisioners": [{
    "execute_command": "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'",
    "inline": [
      "export DEBIAN_FRONTEND=noninteractive",
      "apt-get update",
      "apt-get upgrade -y",
      "apt-get -y install nginx",

        ... 

      "/usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
    ],
    "inline_shebang": "/bin/sh -x",
    "type": "shell"
  }]
}
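One quick sanity check is to derive the image resource ID that this Packer builder should produce and compare it with the `virtualMachineImageId` the pool references. The sketch below does that in Python. The helper names `expected_image_id` and `same_resource` are hypothetical, and the subscription ID is a placeholder because the IDs in this post look anonymized. The ID layout is taken from the image ID shown in the pool JSON.

```python
def expected_image_id(builder: dict) -> str:
    """Build the managed image resource ID from a Packer azure-arm builder block."""
    return (
        f"/subscriptions/{builder['subscription_id']}"
        f"/resourceGroups/{builder['managed_image_resource_group_name']}"
        f"/providers/Microsoft.Compute/images/{builder['managed_image_name']}"
    )


def same_resource(id_a: str, id_b: str) -> bool:
    """Compare two resource IDs, ignoring case and a trailing slash."""
    def normalize(value: str) -> str:
        return value.strip().rstrip("/").lower()

    return normalize(id_a) == normalize(id_b)


# Placeholder values; substitute the ones from your own template and pool.
builder = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "managed_image_resource_group_name": "resource-group-1",
    "managed_image_name": "my-image",
}
pool_image_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourcegroups/resource-group-1"
    "/providers/Microsoft.Compute/images/my-image"
)

print(expected_image_id(builder))
print(same_resource(expected_image_id(builder), pool_image_id))
```

A match only confirms that the pool points at the image Packer wrote. It says nothing about whether the disk behind that image is still available.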
What format are you storing the image in: Managed Disk, Snapshot, Managed Image, or are you using the new Shared Image Gallery? - Sam Cogan
@SamCogan I believe it's a managed disk; I created it with Packer as per the docs, and it's under Home - Images. - sr9yar
OK, if it's been created with Packer it will be a managed disk image. - Sam Cogan
@sr9yar If the answer is helpful you can accept it. - Charles Xu
@CharlesXu np, I was going to test it as well, as we're going to move to a paid subscription one of these days, but I think your answer is correct, since I didn't create a VHD; I was creating an image directly with Packer. - sr9yar

2 Answers

1
votes

I reproduced your issue with the following steps:

  1. Create the managed image through Packer.
  2. Create the Batch Pool with the managed image in the same subscription and region.

I then got the same error as you. Next I ran another test: I created the image from a snapshot and then created the Batch pool with that image. This time the pool worked.

In Azure you can prepare a managed image from snapshots of an Azure VM's OS and data disks, from a generalized Azure VM with managed disks, or from a generalized on-premises VHD that you upload.

Based on this description, it seems that an image created directly through Packer cannot be used as the custom image. I'm not sure about this, but creating the image from a snapshot does work. I hope this helps.

Update

Take a look at the document Custom Images with Batch Shipyard, which says:

Note: Currently creating an ARM Image directly with Packer can only be used with User Subscription Batch accounts. For standard Batch Service pool allocation mode Batch accounts, Packer will need to create a VHD first, then you will need to import the VHD to an ARM Image. Please follow the appropriate path that matches your Batch account pool allocation mode.

In my test, I followed the same steps that Packer performs to create the image. While the source VM exists, the custom image can be used normally for a Batch pool, but allocation fails once the source VM is deleted. So, as the note says, a standard Batch Service account can only use an image created from the VHD file that Packer creates, and that VHD file must continue to exist for the lifetime of the pool.
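To see which underlying resources an image depends on, you can inspect the image resource JSON. The following Python sketch assumes a document shaped like a `Microsoft.Compute/images` resource, with `storageProfile.osDisk` holding a `managedDisk`, `snapshot`, or `blobUri` entry and an optional `sourceVirtualMachine`, either at the top level or nested under `properties`. The function name and the sample IDs are hypothetical.

```python
def image_os_disk_sources(image: dict) -> dict:
    """List the resources the image's OS disk was created from."""
    # Accept both the raw resource shape (fields under "properties")
    # and a flattened shape where they sit at the top level.
    props = image.get("properties") or image
    os_disk = (props.get("storageProfile") or {}).get("osDisk") or {}

    sources = {}
    for key in ("managedDisk", "snapshot"):
        ref = os_disk.get(key) or {}
        if ref.get("id"):
            sources[key] = ref["id"]
    if os_disk.get("blobUri"):
        sources["blobUri"] = os_disk["blobUri"]

    source_vm = props.get("sourceVirtualMachine") or {}
    if source_vm.get("id"):
        sources["sourceVirtualMachine"] = source_vm["id"]
    return sources


# Hypothetical image document for illustration only.
sample_image = {
    "name": "my-image",
    "properties": {
        "sourceVirtualMachine": {"id": "/subscriptions/<sub>/.../virtualMachines/<temp-vm>"},
        "storageProfile": {
            "osDisk": {
                "osType": "Linux",
                "managedDisk": {"id": "/subscriptions/<sub>/.../disks/<temp-os-disk>"},
            }
        },
    },
}

for kind, resource in image_os_disk_sources(sample_image).items():
    print(f"{kind}: {resource}")
```

If the listed disk, snapshot, or VHD blob has since been deleted, that would be consistent with the "source managed disk or snapshot ... was not found" reason in the question, although this sketch does not verify existence itself.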

0
votes

If you're using a managed image, then your imageReference section should look like this:

"imageReference": { "id": "/subscriptions/79b59716-301e-401a-bb8b-22edg5c1he4j/resourceGroups/resource-group-1/providers/Microsoft.Compute/images/my-image" },