0 votes

I hope you can shed some light on this.

I am facing the same issue as described here: Kubernetes deployment not scaling down even though usage is below threshold

My configuration is almost identical.

I have checked the HPA algorithm, but I cannot find an explanation for why I have only one replica of my-app3. Any hints?

kubectl get hpa -A 

NAMESPACE            NAME        REFERENCE              TARGETS            MINPODS   MAXPODS   REPLICAS   AGE
my-ns1               my-app1     Deployment/my-app1     49%/75%, 2%/75%    1         10        2          20h
my-ns2               my-app2     Deployment/my-app2     50%/75%, 10%/75%   1         10        2          22h
my-ns2               my-app3     Deployment/my-app3     47%/75%, 10%/75%   1         10        1          22h
kubectl top po -A

NAMESPACE             NAME                                                     CPU(cores)   MEMORY(bytes)              
my-ns1                pod-app1-8d694bc8f-mkbrh                                 1m           76Mi            
my-ns1                pod-app1-8d694bc8f-qmlnw                                 1m           72Mi            
my-ns2                pod-app2-59d895d96d-86fgm                                1m           77Mi            
my-ns2                pod-app2-59d895d96d-zr67g                                1m           73Mi            
my-ns2                pod-app3-6f8cbb68bf-vdhsd                                1m           47Mi 
Why would you have more? It sits at 1m CPU. Also, this is not related to AKS at all; read the answer in the linked thread. – 4c74356b41
What I found weird was that the currentMetricValue/desiredMetricValue ratio was almost the same for all HPAs, yet this one stayed at 1 replica. Indeed, according to the formula there is no need to scale out the pods. – Catalin
@Catalin As I see it, you have found the reason why your HPA is not scaling your Deployments up or down. Since you found the reason, please post it as an answer to your question to help community members with similar issues. PS: my-app3 would scale to 2 replicas once the metric value rises above 75%; my-app2 would scale down to 1 replica once the metric value drops below 37.5%. – Dawid Kruk

2 Answers

1 vote

Indeed, from my research it seems that the HPA algorithm works as described here: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details

I do not know why my-app3 was assigned one replica and the other two apps two, but according to the algorithm there is no need to scale out at this time.

0 votes

Posting this answer as it could be beneficial for community members wondering why exactly the Horizontal Pod Autoscaler decided not to change the number of replicas in this particular setup.

The formula for the number of replicas a workload will have is:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
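
In code, the calculation can be sketched as follows. This is a minimal Python sketch, not the controller's actual implementation; the desired_replicas name is hypothetical, and the 10% tolerance and the min/max clamping follow the documented algorithm defaults rather than anything specific to this cluster:

import math

def desired_replicas(current_replicas, current_metric, desired_metric,
                     min_replicas=1, max_replicas=10):
    ratio = current_metric / desired_metric
    # Within the default 10% tolerance of the target, HPA skips scaling.
    if abs(ratio - 1.0) <= 0.1:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is clamped to the HPA's minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, desired))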

Looking again at the kubectl get hpa output:

NAMESPACE            NAME        REFERENCE              TARGETS            MINPODS   MAXPODS   REPLICAS   AGE
my-ns1               my-app1     Deployment/my-app1     49%/75%, 2%/75%    1         10        2          20h
my-ns2               my-app2     Deployment/my-app2     50%/75%, 10%/75%   1         10        2          22h
my-ns2               my-app3     Deployment/my-app3     47%/75%, 10%/75%   1         10        1          22h

HPA calculates the desired number of replicas from the current number of replicas.

A side note: in a setup that uses multiple metrics (for example CPU and memory), HPA evaluates each metric separately and acts on the one that yields the highest desired replica count.
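
To illustrate with the hypothetical desired_replicas() sketch above, using the two metric pairs from the my-app1 row:

# With multiple metrics, HPA keeps the highest desired replica count.
metrics = [(49, 75), (2, 75)]
print(max(desired_replicas(2, cur, tgt) for cur, tgt in metrics))  # 2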

Also please consider that downscaling is subject to a stabilization window (a cooldown, 5 minutes by default), so a scale-down decision does not take effect immediately.


Calculation for each of the Deployments

ceil[] rounds a number up:

  • ceil(4.55) = 5
  • ceil(4.01) = 5

app1:

  • Replicas = ceil[2 * (49 / 75)]
  • Replicas = ceil[2 * 0.6533..]
  • Replicas = ceil[1.3066..]
  • Replicas = 2

This example shows that there will be no change to the number of replicas.

The number of replicas would go:

  • Up when the currentMetricValue (currently 49) exceeds the desiredMetricValue (75)
  • Down when the currentMetricValue drops to half of the desiredMetricValue (37.5) or below, since only then does ceil[2 * (currentMetricValue / 75)] equal 1 (see the quick check after this list)
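
A quick check of those thresholds with the desired_replicas() sketch from above:

# app1: 2 replicas, 49% current vs. 75% target
print(desired_replicas(2, 49, 75))  # 2 -- ceil(1.3066..) == 2, no change
# Scaling down to 1 replica requires ceil(2 * c / 75) == 1, i.e. c <= 37.5:
print(desired_replicas(2, 37, 75))  # 1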

app2 is in the same situation as app1, so it can be skipped.

app3:

  • Replicas = ceil[1 * (47 / 75)]
  • Replicas = ceil[1 * 0.6266..]
  • Replicas = ceil[0.6266..]
  • Replicas = 1

This example also shows that there will be no change to the number of replicas.

The number of replicas would go:

  • Up when the currentMetricValue (currently 47) exceeds the desiredMetricValue (75); see the check below
  • It cannot go lower, as the Deployment is already at minReplicas (1)
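
And the same check for app3:

# app3: 1 replica, 47% current vs. 75% target
print(desired_replicas(1, 47, 75))  # 1 -- ceil(0.6266..) == 1, no change
# A second replica appears once utilization rises well above the 75% target:
print(desired_replicas(1, 90, 75))  # 2 -- ceil(1.2) == 2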

Additional resources:

  • Kubernetes.io: Horizontal Pod Autoscale: Algorithm details: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details