Horizontal Pod Autoscaler scales custom metric too aggressively on GKE

Question

I have the following Horizontal Pod Autoscaler configuration on Google Kubernetes Engine to scale a deployment by a custom metric: the RabbitMQ messages-ready count for a specific queue, foo-queue.

It picks up the metric value correctly.

When inserting 2 messages it scales the deployment to the maximum 10 replicas. I expect it to scale to 2 replicas since the targetValue is 1 and there are 2 messages ready.

Why does it scale so aggressively?

HPA configuration:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: foo-hpa
  namespace: development
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: "custom.googleapis.com|rabbitmq_queue_messages_ready"
      metricSelector:
        matchLabels:
          metric.labels.queue: foo-queue
      targetValue: 1

Are you sure about targetValue: 1? Why is this value so small? I have seen samples with recommended values above 100 — Yasen
@Yasen With targetValue: 100 and 2 messages in the queue, the HPA scales to 2 pods. That seems very aggressive; I expect 1 replica — Erez Ben Harush
Would you please read this guide by former Docker developer Jérôme Petazzoni: Kubernetes Deployments: The Ultimate Guide - Semaphore. It explains why in k8s there are two replicas and not one as in docker — Yasen

Erez Ben Harush · Accepted Answer · 2019-09-11T12:54:52

According to https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

From the most basic perspective, the Horizontal Pod Autoscaler controller operates on the ratio between desired metric value and current metric value:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

From the above I understand that as long as the queue holds more ready messages than the target value, the Kubernetes HPA will continue to scale up, since currentReplicas is part of the desiredReplicas calculation.

For example if:

currentReplicas = 1

currentMetricValue / desiredMetricValue = 2/1

then:

desiredReplicas = 2

If the metric stays the same in the next HPA cycle, currentReplicas will be 2 and desiredReplicas will rise to 4.
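
To make the compounding effect concrete, here is a rough Python sketch that applies the documented formula repeatedly while the metric stays at 2 and the target at 1. The function names desired_replicas and simulate are mine, not part of any Kubernetes API, and the sketch is a simplification: it assumes the metric does not change between cycles and ignores the tolerance and other safeguards of the real controller. It only clamps the result to minReplicas and maxReplicas.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    # Formula quoted from the Kubernetes docs:
    # desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
    return math.ceil(current_replicas * (current_metric / target_metric))


def simulate(start_replicas, current_metric, target_metric,
             min_replicas, max_replicas, cycles):
    # Hypothetical helper: re-applies the formula each cycle, assuming the
    # metric value does not change, and clamps to the min/max bounds.
    history = [start_replicas]
    replicas = start_replicas
    for _ in range(cycles):
        wanted = desired_replicas(replicas, current_metric, target_metric)
        replicas = max(min_replicas, min(max_replicas, wanted))
        history.append(replicas)
    return history


# 2 ready messages, targetValue 1, minReplicas 1, maxReplicas 10
print(simulate(1, 2, 1, 1, 10, 5))  # [1, 2, 4, 8, 10, 10]
```

Under these assumptions the replica count doubles every cycle until it reaches maxReplicas, which matches the behavior described in the question.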
