3
votes

I have a Java application running on K8s with min 2 and max 6 pods in a Deployment. The JVM heap is min 256MB, max 512MB, and the memory request and limit are both 1Gi. Here is the HPA spec:

  spec:
    scaleTargetRef:
      apiVersion: extensions/v1beta1
      kind: Deployment
      name: my-app
    minReplicas: 2
    maxReplicas: 6
    metrics:
      - type: Resource
        resource:
          name: cpu
          targetAverageUtilization: 60
      - type: Resource
        resource:
          name: memory
          targetAverageUtilization: 60
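
For context, the container resources and JVM heap settings described above look roughly like this (the container name, image tag, and JAVA_OPTS env var are illustrative, not copied from my manifest):

  containers:
    - name: my-app
      image: my-app:1.0                # illustrative tag
      env:
        - name: JAVA_OPTS              # assumes heap flags are passed via env
          value: "-Xms256m -Xmx512m"
      resources:
        requests:
          memory: 1Gi
        limits:
          memory: 1Gi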

During the performance test, I have noticed that the deployment is trying to scale up very aggressively.

When there is no load, memory utilization is around 33%, and according to the docs (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) the formula for a rough idea of the desired replica count is:

  desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]

From the K8s monitoring, I have noticed that it tries to scale up when memory utilization increases to around 40%. If I understand the formula correctly, desiredReplicas = ceil[2 * (0.40 / 0.60)] = ceil[1.34] = 2, so it should not scale up.

Do I understand it correctly?

1
Are your workflows completing properly? If there is any drop in transactions during some operation, try checking the application logs at the same time you see this HPA behaviour. It happened to me as well: the scenario was generating a bulky report after an API call, and when the PDF size reached a threshold the call was dropping and new pods started spawning. I increased the memory and CPU, and then it got sorted. – Tushar Mahajan
This was one of the practical cases I ran into: while using the Azure Kubernetes Service, I was never able to fully understand the scale-up. Many times the scale-up happened, and even when the load was reduced it took around 10 minutes to restore the original desired replica state. You could check for this behaviour too. – Tushar Mahajan

1 Answer

2
votes

That looks correct, but I'm taking a wild guess because you didn't share the output of kubectl top pods. It could be that your deployment is scaling not because of memory utilization but because of CPU utilization first.
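
For example, to check per-pod usage (the label selector is illustrative; adjust it to your Deployment's labels):

  $ kubectl top pods -l app=my-app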

If you check the docs, every metric is evaluated and the one proposing the largest scale drives the autoscaling:

Kubernetes 1.6 adds support for scaling based on multiple metrics. You can use the autoscaling/v2beta2 API version to specify multiple metrics for the Horizontal Pod Autoscaler to scale on. Then, the Horizontal Pod Autoscaler controller will evaluate each metric, and propose a new scale based on that metric. The largest of the proposed scales will be used as the new scale.
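
To make that concrete with hypothetical numbers (your real values may differ), suppose CPU spikes to 90% during the test while memory sits at 40%:

  # Hypothetical utilizations, 2 current replicas, both targets at 60%:
  # CPU:    ceil[2 * (0.90 / 0.60)] = ceil[3.00] = 3
  # memory: ceil[2 * (0.40 / 0.60)] = ceil[1.34] = 2
  # The HPA takes the largest proposal, so it scales to 3 replicas
  # even though memory alone would not trigger a scale-up.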

You could also try an absolute-value target (targetAverageValue) for your memory metric to troubleshoot:

  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 60
    - type: Resource
      resource:
        name: memory
        targetAverageValue: 700M
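
With your numbers, a target like that should keep the replica count at the minimum under no load (assuming roughly 33% of the 1Gi request, about 340M per pod; these figures are approximations):

  # currentAverageValue ~ 340M per pod (33% of the 1Gi request)
  # ceil[2 * (340 / 700)] = ceil[0.98] = 1 -> clamped to minReplicas: 2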

A good way to see the current metrics is to get the full output of the HPA and inspect its status:

$ kubectl get hpa <hpa-name> -o=yaml
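
The status section shows what the controller actually sees and proposes; a trimmed example (field names from autoscaling/v2beta1, values hypothetical):

  status:
    currentReplicas: 2
    desiredReplicas: 3
    currentMetrics:
      - type: Resource
        resource:
          name: cpu
          currentAverageUtilization: 90   # hypothetical value
      - type: Resource
        resource:
          name: memory
          currentAverageUtilization: 40   # hypothetical value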