I am running a Kubernetes v1.16 cluster (currently the newest version on GKE) with an HPA that scales the deployments based on custom metrics (specifically, the RabbitMQ message count fetched from Google Cloud Monitoring).
The Problem
The deployments scale up very fast to maximum pod count when the message count is temporarily high.
Information
The HPA flag --horizontal-pod-autoscaler-sync-period is set to 15 seconds on GKE and, as far as I know, can't be changed.
My custom metrics are updated every 30 seconds.
I believe the cause is this: while the message count in the queues is high, the HPA triggers a scale-up every 15 seconds, and after a few cycles the deployment reaches its maximum pod count.
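To illustrate what I think is happening, here is a minimal sketch of the documented HPA formula, desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue), applied once per sync. The function names and the message count of 2000 are hypothetical, and the sketch assumes the metric stays constant between syncs. It also ignores the HPA tolerance and any per-sync scale-up limit, so it is a simplification rather than the controller's actual code.

```python
import math


def desired_replicas(current, metric_value, target_value, min_replicas, max_replicas):
    """One HPA sync: scale the current replica count by metric/target, then clamp."""
    ratio = metric_value / target_value
    desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))


def simulate(start, metric_value, target_value, min_replicas, max_replicas, syncs):
    """Replica count after each sync, assuming the metric does not change."""
    history = [start]
    for _ in range(syncs):
        history.append(
            desired_replicas(history[-1], metric_value, target_value,
                             min_replicas, max_replicas)
        )
    return history


if __name__ == "__main__":
    # Hypothetical spike: 2000 messages against a target of 500,
    # with minReplicas=6 and maxReplicas=100 as in my HPA.
    print(simulate(6, 2000, 500, 6, 100, 3))  # [6, 24, 96, 100]
```

Under these assumptions the deployment goes from 6 to 100 pods in three syncs (about 45 seconds), because each sync multiplies the already scaled-up replica count by the same ratio before the new pods have had time to drain the queue.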
In Kubernetes v1.18, the HPA API lets you control the scale-up stabilization window (through the behavior field in autoscaling/v2beta2), but I can't find a similar feature in v1.16.
My Question
How can I make the HPA scale up more gradually?
Edit 1
Sample HPA of one of my deployments:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: my-deployment-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 6
  maxReplicas: 100
  metrics:
  - type: External
    external:
      metricName: "custom.googleapis.com|rabbit_mq|v1-compare|messages_count"
      metricSelector:
        matchLabels:
          metric.labels.name: production
      targetValue: 500