I have an App Service plan that scales as follows:

  • Range: 2-20 instances
  • CPU > 20%: scale to 20 instances
  • CPU < 15%: scale to 2 instances

The actual number of instances that I see over time is the following:

  • 20 instances
  • (CPU ~ 0%)
  • scale down to 2 instances
  • (load started: CPU > 50%)
  • up to 20 instances (in ~5 mins)
  • (load finished: CPU ~ 0%)
  • down to 15 instances (in ~5 mins)
  • down to 12 instances (in ~5 mins)
  • down to 5 instances (in ~5 mins)
  • down to 2 instances (in ~5 mins)

How can I make it scale down to 2 instances in a single step, without the intermediate steps?

Also, the scale-up from 2 to 20 instances sometimes fails, and the 2 remaining instances are left under heavy load. Is it possible to tell Azure to scale to as many instances as it can, instead of leaving users on 2 instances during heavy load?

Update: added the scale settings JSON.

Update: added scale settings screenshots.

Overall scale settings: scale settings and chart

How it scales up: scale up

How it scales down: scale down

{
  "id": "...",
  "name": "...",
  "type": "Microsoft.Insights/autoscaleSettings",
  "location": "westus",
  "tags": {
    "$type": "...",
    "...": "Resource"
  },
  "properties": {
    "profiles": [
      {
        "name": "Default",
        "capacity": {
          "minimum": "2",
          "maximum": "20",
          "default": "2"
        },
        "rules": [
          {
            "metricTrigger": {
              "metricName": "CpuPercentage",
              "metricNamespace": "",
              "metricResourceUri": "...",
              "timeGrain": "PT1M",
              "statistic": "Average",
              "timeWindow": "PT5M",
              "timeAggregation": "Maximum",
              "operator": "GreaterThan",
              "threshold": 20
            },
            "scaleAction": {
              "direction": "Increase",
              "type": "ExactCount",
              "value": "20",
              "cooldown": "PT1M"
            }
          },
          {
            "metricTrigger": {
              "metricName": "CpuPercentage",
              "metricNamespace": "",
              "metricResourceUri": "...",
              "timeGrain": "PT1M",
              "statistic": "Average",
              "timeWindow": "PT5M",
              "timeAggregation": "Average",
              "operator": "LessThan",
              "threshold": 15
            },
            "scaleAction": {
              "direction": "Decrease",
              "type": "ExactCount",
              "value": "2",
              "cooldown": "PT1M"
            }
          }
        ]
      }
    ],
    "enabled": true,
    "name": "...",
    "targetResourceUri": "...",
    "notifications": [
      {
        "operation": "Scale",
        "email": {
          "sendToSubscriptionAdministrator": false,
          "sendToSubscriptionCoAdministrators": false,
          "customEmails": [
            "..."
          ]
        },
        "webhooks": null
      }
    ]
  }
}

1 Answer

You can customize the Autoscale settings of your Web App to meet this requirement.

If you are using C#, you can use the Azure Monitoring Services Management Library. Otherwise, you can call the REST APIs from whichever language you are using to manage the Autoscale settings. See https://msdn.microsoft.com/en-us/library/azure/dn510367.aspx for a detailed description of the elements of the settings.

Please refer to https://github.com/Azure/azure-content/blob/master/articles/best-practices-auto-scaling.md for more info.
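If you manage the settings programmatically, it can help to generate the rule JSON from code rather than edit it by hand. The following is only a minimal sketch in Python: it rebuilds the two rules from the question's JSON (same field names and values) and serializes them. The helper names `make_rule` and `make_profile` are hypothetical, the elided `"..."` resource URI is left as-is, and the sketch does not send anything to Azure.

```python
import json


def make_rule(operator, threshold, direction, target_count,
              time_aggregation="Average", window="PT5M", cooldown="PT1M"):
    """Build one rule dict shaped like the rules in the question's JSON."""
    return {
        "metricTrigger": {
            "metricName": "CpuPercentage",
            "metricNamespace": "",
            "metricResourceUri": "...",  # elided in the question; left as-is
            "timeGrain": "PT1M",
            "statistic": "Average",
            "timeWindow": window,
            "timeAggregation": time_aggregation,
            "operator": operator,
            "threshold": threshold,
        },
        "scaleAction": {
            "direction": direction,
            "type": "ExactCount",        # jump straight to a fixed instance count
            "value": str(target_count),  # the question's JSON stores counts as strings
            "cooldown": cooldown,
        },
    }


def make_profile(minimum, maximum, rules, name="Default"):
    """Build a profile dict with the capacity range and its rules."""
    return {
        "name": name,
        "capacity": {
            "minimum": str(minimum),
            "maximum": str(maximum),
            "default": str(minimum),
        },
        "rules": rules,
    }


# The two rules from the question: CPU > 20% -> 20 instances, CPU < 15% -> 2 instances.
profile = make_profile(2, 20, [
    make_rule("GreaterThan", 20, "Increase", 20, time_aggregation="Maximum"),
    make_rule("LessThan", 15, "Decrease", 2),
])

# Partial request body: only the "profiles" and "enabled" properties are built here.
body = json.dumps({"properties": {"profiles": [profile], "enabled": True}}, indent=2)
```

The resulting `body` string would still need the remaining properties (`targetResourceUri`, `notifications`, and so on) before being submitted through the REST API or a management library.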

Update:

According to the last paragraph of Use Azure Autoscale:

When you configure multiple policies and rules, they could conflict with each other. Autoscale uses the following conflict resolution rules to ensure that there is always a sufficient number of instances running:

  • Scale out operations always take precedence over scale in operations.
  • When scale out operations conflict, the rule that initiates the largest increase in the number of instances takes precedence.
  • When scale in operations conflict, the rule that initiates the smallest decrease in the number of instances takes precedence.
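To make the three conflict resolution rules above concrete, here is a small Python sketch of how they could combine. It is an illustration of the quoted rules only, not Azure's actual implementation; the function name `resolve_instance_count` and the idea of passing each fired rule's target count as a list are assumptions made for the example.

```python
def resolve_instance_count(current, proposed_targets):
    """Pick the resulting instance count from the targets proposed by fired rules.

    current: number of instances running now.
    proposed_targets: instance counts that the triggered rules ask for.
    """
    scale_out = [t for t in proposed_targets if t > current]
    scale_in = [t for t in proposed_targets if t < current]

    if scale_out:
        # Scale out beats scale in; among scale-out rules, the largest increase wins.
        return max(scale_out)
    if scale_in:
        # Among scale-in rules, the smallest decrease wins (the highest target).
        return max(scale_in)
    # No rule asks for a change.
    return current


# A scale-out rule (to 20) and a scale-in rule (to 2) fire together: scale out wins.
mixed = resolve_instance_count(5, [20, 2])

# Two scale-in rules conflict: the smaller decrease (10 -> 8) wins over 10 -> 2.
both_in = resolve_instance_count(10, [2, 8])
```

With the values above, `mixed` is 20 and `both_in` is 8.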