How can kubernetes autoscale using HPA with metric server?

Question

I am very interested in testing the Kubernetes autoscaling solution on an Ubuntu installation. I have already used it in minikube with Heapster, but since Heapster is deprecated, I am trying to use metrics-server instead. On my Ubuntu cluster, metrics-server is installed and running, as shown below:

kube-system      kube-apiserver-kmaster                  1/1     Running   1          11d
kube-system      kube-controller-manager-kmaster         1/1     Running   1          11d
kube-system      kube-proxy-47k6b                        1/1     Running   0          11d
kube-system      kube-proxy-q8zdw                        1/1     Running   1          11d
kube-system      kube-scheduler-kmaster                  1/1     Running   1          11d
kube-system      kubernetes-dashboard-5f7b999d65-6wl6k   1/1     Running   1          11d
kube-system      metrics-server-548456b4cd-wxc9b         1/1     Running   0          3d18h
metallb-system   controller-cd8657667-ckpn6              1/1     Running   0          8d
metallb-system   speaker-m9599

But when I check the HPA, I always see the following:

kubectl get hpa

NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
api-server   Deployment/api-server   <unknown>/50%   1         10        3          3d19h
ngsc         Deployment/ngsc         <unknown>/50%   1         10        3          3d19h

It seems that metrics-server is not being used to calculate the usage.

I went through the Kubernetes documentation and could not figure out how to configure utilization for metrics-server so that Kubernetes performs the autoscaling.

Here is the output of describing the autoscaler:

Name:                                                  api-server
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Fri, 03 May 2019 05:49:07 +0000
Reference:                                             Deployment/api-server
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 50%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       3 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                   Age                        From                       Message
  ----     ------                   ----                       ----                       -------
  Warning  FailedGetResourceMetric  4m48s (x22069 over 3d20h)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API

The output of describing the deployment:

Pod Template:
  Labels:  app=api-server
  Containers:
   api-server:
    Image:      xxxxxx
    Port:       <none>
    Host Port:  <none>
    Limits:
      cpu:  500m
    Requests:
      cpu:        200m
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>

This means the deployment has the resource configuration, but the HPA still shows <unknown>.

After adding memory requests and limits, the describe output is:

    Limits:
      cpu:     500m
      memory:  1Gi
    Requests:
      cpu:        500m
      memory:     512Mi

But kubectl get hpa still shows <unknown>.

Checking logs for the metrics-server:

 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:kmaster: unable to fetch metrics from Kubelet kmaster (kmaster): Get https://kmaster:10250/stats/summary/: dial tcp: lookup kmaster on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:knode: unable to fetch metrics from Kubelet knode (knode): Get https://knode:10250/stats/summary/: dial tcp: lookup knode on 10.96.0.10:53: no such host]
E0507 05:20:23.797590       1 reststorage.go:148] unable to fetch pod metrics for pod default/api-server-777b78ccf5-mlt94: no metrics known for pod
E0507 05:20:23.797614       1 reststorage.go:148] unable to fetch pod metrics for pod default/api-server-777b78ccf5-r66bw: no metrics known for pod

And when I run

curl -k https://knode:10250/stats/summary/

I get this error:

Unauthorized

PjoterS · Accepted Answer · 2019-06-18T15:52:55

Based on the information you provided:

Since the pod metrics-server-548456b4cd-wxc9b is running, metrics-server is enabled. Also, since you have 3 replicas, I assume this number was set in the Deployment manifest.

The HPA might not have scaled your deployment due to:

1) Lack of resources

$ kubectl describe node
...
 Namespace                  Name                                 CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                 ------------  ----------  ---------------  -------------  ---
  default                    nginx-deployment-5ffb677f99-k5mdj    200m (10%)    500m (25%)  0 (0%)           0 (0%)         6m55s
  default                    nginx-deployment-5ffb677f99-n7t7n    200m (10%)    500m (25%)  0 (0%)           0 (0%)         6m55s
  default                    nginx-deployment-5ffb677f99-pw2g7    200m (10%)    500m (25%)  0 (0%)           0 (0%)         6m55s
  kube-system                etcd-minikube                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                kube-addon-manager-minikube          5m (0%)       0 (0%)      50Mi (0%)        0 (0%)         152m
  kube-system                kube-apiserver-minikube              250m (12%)    0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                kube-controller-manager-minikube     200m (10%)    0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                kube-dns-6bfbdd666c-l74lx            260m (13%)    0 (0%)      110Mi (1%)       170Mi (2%)     32m
  kube-system                kube-proxy-dnh4m                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         153m
  kube-system                kube-scheduler-minikube              100m (5%)     0 (0%)      0 (0%)           0 (0%)         152m
  kube-system                metrics-server-77fddcc57b-mjlf5      0 (0%)        0 (0%)      0 (0%)           0 (0%)         147m
  kube-system                storage-provisioner                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         153m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                1415m (70%)  1500m (75%)
  memory             160Mi (2%)   170Mi (2%)
  ephemeral-storage  0 (0%)       0 (0%)

As you can see in this example, the minikube system pods and 3 nginx pods have already requested 70% of the CPU. In your manifest each container requests cpu: 200m, so with roughly 30% of the node's CPU still unrequested, this deployment can create only 2 more pods. Any further pods will stay in the Pending state due to lack of CPU resources.
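The headroom estimate can be checked with a little arithmetic. This is only a sketch: the 2000m capacity is inferred from the example output (200m is shown as 10%), not read from the node, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Values taken from the example node description above.
requested_m=1415       # CPU requests already allocated, in millicores
capacity_m=2000        # ASSUMPTION: inferred from "200m (10%)", not read from the node
per_pod_request_m=200  # CPU request of each container in the deployment

# Integer arithmetic: unrequested millicores, then how many whole pods still fit.
free_m=$((capacity_m - requested_m))
extra_pods=$((free_m / per_pod_request_m))

echo "free CPU: ${free_m}m, additional pods that fit: ${extra_pods}"
```

With these numbers 585m remains unrequested, which is enough for 2 more pods at 200m each.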

2) Lack of CPU load

An error message like "the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API" means that metrics-server did not return any metrics for the pods, which may happen when the pods have not generated any load.

I assume you created the autoscaler with a command like:

$ kubectl autoscale deployment api-server --cpu-percent=50 --min=1 --max=10
...
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                        Age   From                       Message
  ----     ------                        ----  ----                       -------
  Warning  FailedGetResourceMetric       9s    horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  9s    horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
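For reference, the same autoscaler can be written as a manifest. The following is a rough sketch, assuming the autoscaling/v1 API and a Deployment served under apps/v1 (your cluster may use a different apiVersion for the Deployment); the file name api-server-hpa.yaml is arbitrary.

```shell
#!/bin/sh
# Write an HPA manifest that mirrors:
#   kubectl autoscale deployment api-server --cpu-percent=50 --min=1 --max=10
# ASSUMPTIONS: autoscaling/v1 is available and the Deployment is under apps/v1.
cat > api-server-hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: api-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF

# To apply it (requires kubectl and access to the cluster):
# kubectl apply -f api-server-hpa.yaml
```

Keeping the HPA in a file makes it easier to share together with the Deployment manifest when asking for help.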

Try to generate some CPU load by opening a shell in one of the deployment's pods:

$ kubectl exec -ti <yourPodName> -- sh

$ while true; do echo 'IncreaseLoad'; done
IncreaseLoad
IncreaseLoad
IncreaseLoad
...

You can also use the stress command.

After a while the HPA should receive metrics and the target should change from <unknown> to the actual value.

Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                        Age                From                       Message
  ----     ------                        ----               ----                       -------
  Warning  FailedGetResourceMetric       14m (x6 over 16m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  14m (x6 over 16m)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Normal   SuccessfulRescale             6m54s              horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal   SuccessfulRescale             50s                horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
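The rescale events above follow from the replica formula in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). Below is a minimal sketch of that calculation. The helper name desired_replicas is hypothetical, the utilization figures are illustrative rather than taken from your cluster, and the real controller also applies a tolerance and the min/max bounds, which this sketch ignores.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_UTILIZATION_PCT TARGET_UTILIZATION_PCT
# Hypothetical helper: ceil(current * utilization / target) with integer math.
# Ignores the controller's tolerance and the minReplicas/maxReplicas clamp.
desired_replicas() {
    current=$1
    utilization=$2
    target=$3
    echo $(( (current * utilization + target - 1) / target ))
}

# Illustrative inputs only: utilization is a percentage of the CPU request.
desired_replicas 3 20 50    # low load: fewer replicas are enough
desired_replicas 2 100 50   # load at twice the target: replicas double
```

Because utilization is measured against the CPU request, raising the request from 200m to 500m lowers the reported percentage for the same absolute load.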

If this didn't help, please provide your HPA and Deployment manifests.
