The docs say:
For per-pod resource metrics (like CPU), the controller fetches the metrics from the resource metrics API for each Pod targeted by the HorizontalPodAutoscaler. Then, if a target utilization value is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each Pod. If a target raw value is set, the raw metric values are used directly. The controller then takes the mean of the utilization or the raw value (depending on the type of target specified) across all targeted Pods, and produces a ratio used to scale the number of desired replicas.
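As I read that paragraph, the calculation looks roughly like the sketch below. This is only my understanding, not the controller's actual code: the function names are made up for illustration, every Pod is assumed to have the same CPU request, and I'm ignoring details such as the tolerance band, missing metrics, and not-yet-ready Pods.

```python
import math


def utilization_percent(usage_millicores, request_millicores):
    """CPU usage of one Pod as a percentage of its CPU *request* (not its limit)."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas, pod_usages, request_millicores, target_utilization):
    """Simplified HPA rule: ceil(currentReplicas * currentValue / targetValue)."""
    utilizations = [utilization_percent(u, request_millicores) for u in pod_usages]
    # Mean utilization across all targeted Pods.
    mean_utilization = sum(utilizations) / len(utilizations)
    # Ratio of observed to target utilization drives the scaling decision.
    ratio = mean_utilization / target_utilization
    return math.ceil(current_replicas * ratio)


# Two Pods requesting 100m each, using 90m and 150m, with a 60% target:
# mean utilization is 120%, so the ratio is 2.0 and the replica count doubles.
print(desired_replicas(2, [90, 150], 100, 60))  # prints 4
```

The point that matters for my question is that the denominator is the request, so a small request makes the utilization percentage large very quickly.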
Assume I have a Pod with:
resources:
  limits:
    cpu: "0.3"
    memory: 500M
  requests:
    cpu: "0.01"
    memory: 40M
and an autoscaling definition like this:
type: Resource
resource:
  name: cpu
  target:
    type: Utilization
    averageUtilization: 60
According to the docs, this means:
With this metric the HPA controller will keep the average utilization of the pods in the scaling target at 60%. Utilization is the ratio between the current usage of resource to the requested resources of the pod
So I'm missing something here. If the request is the minimum amount of resources required to run the app, why is scaling based on that value? 60% of a 0.01 CPU request is only 0.006 CPU (6 millicores), which is almost nothing, so the service would be scaling up constantly.
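To make my arithmetic concrete, here is how I'm reading the numbers from the Pod spec above. It is a rough sketch: `cpu_to_millicores` is a helper I made up for illustration, and it only handles plain decimal values and the `m` suffix.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "0.01" or "250m" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


request_m = cpu_to_millicores("0.01")  # 10 millicores
limit_m = cpu_to_millicores("0.3")     # 300 millicores

# Usage level at which a Pod sits exactly on the 60% target.
threshold_m = request_m * 60 / 100

# Utilization reported for a Pod that uses its full CPU limit.
utilization_at_limit = 100 * limit_m / request_m

print(threshold_m)           # 6.0 millicores
print(utilization_at_limit)  # 3000.0 percent of the request
```

So any Pod averaging more than 6 millicores is already above the target, and a Pod running at its limit would report 3000% utilization.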