Azure metrics observed metric value vs actual metric value for Auto Scaling

Question

I have an autoscale rule that will not fire.

The scale-out rule says that if CPU Percentage is above 70%, an instance is added. The duration is 2 minutes and the cool-down period is 2 minutes.

When I built a Metrics chart to compare the actual CPU percentage with the observed value, I could clearly see spikes in my CPU, but the observed value seems to be averaged over a longer time period, and I don't know why. What setting can I use in my scale rules to control the time period over which my rule averages?

SnehaAgrawal-MSFT · Accepted Answer · 2020-05-06T13:57:06

Thanks for asking the question! You may want to look at Best practices for Autoscale.

Also, it’s important to understand the flapping process:

We recommend carefully choosing different thresholds for scale-out and scale-in based on your practical situation. We don’t recommend autoscale settings like the example below, which use the same or very similar threshold values for the out and in conditions:

Take this as an example:

Increase instances by 1 count when Thread Count >= 600

Decrease instances by 1 count when Thread Count <= 600

Now please consider the following process:

Assume there are two instances to begin with and then the average number of threads per instance grows to 625.

Autoscale scales out adding a third instance.

Next, assume that the average thread count across instances falls to 575.

Before scaling in, autoscale tries to estimate what the final state would be if it scaled in. For example, 575 x 3 (current instance count) = 1,725 threads in total, and 1,725 / 2 (final number of instances after scaling in) = 862.5 threads per instance. This means autoscale would have to scale out again immediately after it scaled in, if the average thread count stayed the same or fell only a small amount. However, if it scaled out again, the whole process would repeat, leading to an infinite loop.

To avoid this situation (termed "flapping"), autoscale does not scale in at all. Instead, it skips the action and reevaluates the condition the next time the service's job executes. This can confuse many people, because autoscale doesn't appear to work when the average thread count is 575.

Estimation during a scale-in is intended to avoid "flapping" situations, where scale-in and scale-out actions continually go back and forth. Keep this behavior in mind if you choose the same thresholds for scale-out and scale-in.
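The estimate described above can be reproduced with a few lines of Python. This is only a minimal sketch of the arithmetic in this answer, not the actual autoscale implementation; the function names are hypothetical, and it assumes that a scale-in is skipped whenever the projected average would reach the scale-out threshold.

```python
def projected_average_after_scale_in(current_avg, instances, step=1):
    """Per-instance average if the same total load were spread over fewer instances."""
    remaining = instances - step
    if remaining < 1:
        raise ValueError("scale-in would leave no instances")
    total_load = current_avg * instances
    return total_load / remaining


def scale_in_would_flap(current_avg, instances, scale_out_threshold, step=1):
    """Assumption: scale-in is skipped if the projection would trigger scale-out again."""
    projected = projected_average_after_scale_in(current_avg, instances, step)
    return projected >= scale_out_threshold


# Numbers from the example: 3 instances averaging 575 threads, scale-out at 600.
print(projected_average_after_scale_in(575, 3))  # 862.5
print(scale_in_would_flap(575, 3, 600))          # True, so the scale-in is skipped
```

With these numbers the projected value (862.5) is well above the scale-out threshold of 600, which is why no scale-in happens even though the scale-in condition itself is met.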

We recommend choosing an adequate margin between the scale-out and scale-in thresholds. As an example, consider the following better rule combination.

Increase instances by 1 count when CPU% >= 80

Decrease instances by 1 count when CPU% <= 60
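Even with a margin between the thresholds, the scale-in estimate still applies, so the value at which a scale-in can actually happen may be lower than the configured scale-in threshold. The sketch below works this out for the rule combination above. It is a simplified model that assumes the load is redistributed evenly over the remaining instances and that a scale-in is skipped when the projected average would reach the scale-out threshold; `scale_in_ceiling` is a hypothetical helper, not part of any Azure API.

```python
def scale_in_ceiling(scale_out_threshold, scale_in_threshold, instances, step=1):
    """Upper bound on the per-instance average at which a scale-in can proceed.

    The projected average after scale-in is avg * instances / remaining,
    and (by assumption) it must stay below the scale-out threshold.
    """
    remaining = instances - step
    if remaining < 1:
        raise ValueError("scale-in would leave no instances")
    no_flap_bound = scale_out_threshold * remaining / instances
    return min(scale_in_threshold, no_flap_bound)


# Scale out at CPU% >= 80, scale in at CPU% <= 60.
print(round(scale_in_ceiling(80, 60, instances=3), 1))  # 53.3
print(scale_in_ceiling(80, 60, instances=5))            # 60
```

Under these assumptions, with 3 instances the average CPU has to drop below roughly 53% before a scale-in goes through, while with 5 instances the configured 60% threshold is the limiting value.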

In addition, keep the cool-down period in mind. After a scale-in or scale-out operation has happened, the autoscale rule will not trigger again until the cool-down period has passed, even if the rule condition is true (for example, CPU remains high). If the cool-down is 2 minutes, no rule is triggered for the 2 minutes following a scale operation.
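The cool-down behavior can be pictured as a simple time check. The following is a minimal sketch, not the service's real logic; `can_trigger` is a hypothetical name, and whether the boundary is inclusive is an assumption.

```python
from datetime import datetime, timedelta


def can_trigger(now, last_scale_action, cooldown):
    """Return True if a rule may fire, i.e. the cool-down has elapsed.

    Assumption: the rule may fire again exactly when the cool-down ends.
    """
    if last_scale_action is None:
        return True  # nothing has scaled yet, so no cool-down applies
    return now - last_scale_action >= cooldown


last_action = datetime(2020, 5, 6, 13, 0, 0)
cooldown = timedelta(minutes=2)

# One minute after scaling: blocked, even if CPU is still above the threshold.
print(can_trigger(datetime(2020, 5, 6, 13, 1, 0), last_action, cooldown))  # False
# Two minutes after scaling: the rule can fire again.
print(can_trigger(datetime(2020, 5, 6, 13, 2, 0), last_action, cooldown))  # True
```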
