I have a Java application running on Kubernetes in a deployment with a minimum of 2 and a maximum of 6 pods. The JVM heap is min 256 MB, max 512 MB, and the memory request and limit are both 1Gi. Here is the HPA spec:
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 60
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 60
During a performance test, I noticed that the deployment scales up very aggressively.
When there is no load, memory utilization is around 33%. According to the documentation at https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/, the formula for a rough estimate of the desired number of pods is:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
From the K8s monitoring, I noticed that it tries to scale up when memory utilization increases to around 40%. If I understand the formula correctly, desiredReplicas = ceil[2 * (0.4 / 0.6)] = ceil[1.33] = 2, so it should not scale up.
Do I understand it correctly?
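For reference, here is the arithmetic I am doing, as a minimal Python sketch. The function name `desired_replicas` is my own (it is not part of any Kubernetes API), and it only applies the basic formula from the docs to a single metric; it does not model anything else the autoscaler may take into account.

```python
import math

def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Basic HPA formula from the docs, applied to one metric (values in percent)."""
    return math.ceil(current_replicas * (current_utilization / target_utilization))

# Values from my setup: 2 replicas, memory target of 60%.
print(desired_replicas(2, 33, 60))  # idle: ceil(1.1) = 2
print(desired_replicas(2, 40, 60))  # under test: ceil(1.33) = 2
```

Both cases give 2, which is why I expected no scale-up at 40% memory utilization.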