Prevent K8S HPA from deleting pod after load is reduced

Question

I have Sidekiq custom metrics coming from the Prometheus adapter. Using those queue metrics from Prometheus, I have set up an HPA. When the number of jobs in the Sidekiq queue goes above, say, 1000, the HPA triggers 10 new pods, and each pod then executes 100 jobs from the queue. When the number of jobs drops to, say, 400, the HPA scales down. But when the scale-down happens, the HPA kills pods, say 4 of them. Those 4 pods were still running jobs, say 30-50 jobs each. When the HPA deletes these 4 pods, the jobs running on them are terminated as well, and those jobs are marked as failed in Sidekiq.

So what I want to achieve is to stop the HPA from deleting pods that are still executing jobs. Moreover, I want the HPA not to scale down even after the load is reduced to the minimum, and instead to delete pods only when the jobs-in-queue metric from Sidekiq is 0.

Is there any way to achieve this?

That will use up my resources. My minimum scale is 1 and my maximum is 10. If I set the minimum to 10, it will use more resources. — Hb_1993
So basically, can I assume you don't want the HPA to delete pods while they are still processing? — damitj07
I can suggest (with a pinch of salt) a pre-stop hook to delay the eviction, where the delay is based on the operation time or on some logic that knows whether the processing is done. ref - blog.gruntwork.io/… — damitj07

prometherion · Accepted Answer · 2019-12-30T08:50:02

Weird usage, honestly: you're wasting resources even while your traffic is in the cool-down phase, but since you didn't provide further details, here it is.

Actually, it's not possible to achieve exactly what you want, since the HPA is designed to size your workload to follow the load. The only way to get close (and this is not recommended) is to set the --horizontal-pod-autoscaler-downscale-stabilization flag of the Kubernetes controller manager (kube-controller-manager) to a higher value. Its default is 5 minutes.

JFI, the doc warns you:

Note: When tuning these parameter values, a cluster operator should be aware of the possible consequences. If the delay (cooldown) value is set too long, there could be complaints that the Horizontal Pod Autoscaler is not responsive to workload changes. However, if the delay value is set too short, the scale of the replicas set may keep thrashing as usual.
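
To make the effect of that flag concrete, here is a minimal Python sketch of how a downscale stabilization window behaves. It is not the controller's code: the function names, the 15-second sync interval, the target of 100 jobs per pod, and the 30-second window are all assumptions chosen to mirror the numbers in the question. The idea it models is that the autoscaler remembers its recent recommendations and only scales down to the highest one seen within the window.

```python
import math
from collections import deque


def recommend(queue_length, target_per_pod=100, min_replicas=1, max_replicas=10):
    """Raw recommendation: enough pods for target_per_pod jobs each, clamped.

    Assumes the queue length is used as a metric with an average-value target.
    """
    wanted = math.ceil(queue_length / target_per_pod)
    return max(min_replicas, min(max_replicas, wanted))


def simulate(queue_samples, window_seconds, interval_seconds=15):
    """Return the replica count chosen after each metric sample.

    Scale-up is applied immediately; scale-down is limited to the highest
    recommendation made within the last window_seconds.
    """
    history = deque()  # (timestamp, recommendation) pairs, oldest first
    chosen = []
    for i, queue_length in enumerate(queue_samples):
        now = i * interval_seconds
        history.append((now, recommend(queue_length)))
        # Forget recommendations that have fallen out of the window.
        while history[0][0] < now - window_seconds:
            history.popleft()
        chosen.append(max(rec for _, rec in history))
    return chosen


if __name__ == "__main__":
    samples = [1000, 400, 400, 0, 0, 0]  # jobs in the queue at each sync
    print("no window: ", simulate(samples, window_seconds=0))
    print("30s window:", simulate(samples, window_seconds=30))
```

With no window the sketch drops from 10 to 4 pods as soon as the queue reaches 400; with the 30-second window it holds 10 pods for two more syncs before stepping down. Note that a longer window only delays the scale-down. It has no knowledge of whether a given pod is still in the middle of a job.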
