Starting from Kubernetes v1.18 the v2beta2 API allows scaling behavior to be configured through the Horizontal Pod Autoscalar (HPA) behavior field. I'm planning to apply HPA with custom metrics to a StatefulSet.
The use case I'm looking at is scaling out using a custom metric (e.g. number of user sessions on my application), but the HPA will not scale down at all. This use case is also described by K8s SIG-Autoscaling enhancements - "Configurable scale velocity for HPA >> Story 4: Scale Up As Usual, Do Not Scale Down".
behavior:
scaleDown:
policies:
- type: pods
value: 0
The user sessions could stay active for minutes to hours. Starting with 1 replica of the StatefulSet, as the number of user sessions hit an upper limit (exposed using Prometheus collector and later configured using HPA custom metric option), the application pods will scale-out. The new pods will start serving new users.
Since this is a StatefulSet and cannot just abruptly scale down, I'm seeking help on ways to scale down when the user sessions on the new replicas go down to 0. The above link says that the scale down can be controlled by a separate process. Not sure how to do this? Looking for some pointers.
Thanks.