
I am trying to set up a Horizontal Pod Autoscaler to automatically scale my API server pods up and down based on CPU usage.

I currently have 12 pods running for my API but they are using ~0% CPU.

kubectl get pods
NAME                                       READY   STATUS    RESTARTS   AGE
api-server-deployment-578f8d8649-4cbtc     2/2     Running   2          12h
api-server-deployment-578f8d8649-8cv77     2/2     Running   2          12h
api-server-deployment-578f8d8649-c8tv2     2/2     Running   1          12h
api-server-deployment-578f8d8649-d8c6r     2/2     Running   2          12h
api-server-deployment-578f8d8649-lvbgn     2/2     Running   1          12h
api-server-deployment-578f8d8649-lzjmj     2/2     Running   2          12h
api-server-deployment-578f8d8649-nztck     2/2     Running   1          12h
api-server-deployment-578f8d8649-q25xb     2/2     Running   2          12h
api-server-deployment-578f8d8649-tx75t     2/2     Running   1          12h
api-server-deployment-578f8d8649-wbzzh     2/2     Running   2          12h
api-server-deployment-578f8d8649-wtddv     2/2     Running   1          12h
api-server-deployment-578f8d8649-x95gq     2/2     Running   2          12h
model-server-deployment-76d466dffc-4g2nd   1/1     Running   0          23h
model-server-deployment-76d466dffc-9pqw5   1/1     Running   0          23h
model-server-deployment-76d466dffc-d29fx   1/1     Running   0          23h
model-server-deployment-76d466dffc-frrgn   1/1     Running   0          23h
model-server-deployment-76d466dffc-sfh45   1/1     Running   0          23h
model-server-deployment-76d466dffc-w2hqj   1/1     Running   0          23h

My api_hpa.yaml looks like:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server-deployment
  minReplicas: 4
  maxReplicas: 12
  targetCPUUtilizationPercentage: 50

It has now been 24 hours and the HPA has still not scaled my pods down to 4, even though they have seen essentially no CPU usage.

When I look at the GKE Deployment details dashboard I see the warning "Unable to read all metrics".

Is this causing autoscaler to not scale down my pods?

And how do I fix it?
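For reference, I believe the HPA's view of the metric can be checked with the standard kubectl commands (HPA name taken from the manifest above):

kubectl get hpa api-hpa
kubectl describe hpa api-hpa

If the TARGETS column shows <unknown>/50%, or the events mention FailedGetResourceMetric, the autoscaler presumably cannot compute CPU utilization at all.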

It is my understanding that GKE runs a metrics server automatically:

kubectl get deployment --namespace=kube-system
NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
event-exporter-gke                         1/1     1            1           18d
kube-dns                                   2/2     2            2           18d
kube-dns-autoscaler                        1/1     1            1           18d
l7-default-backend                         1/1     1            1           18d
metrics-server-v0.3.6                      1/1     1            1           18d
stackdriver-metadata-agent-cluster-level   1/1     1            1           18d
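To double-check that this metrics-server is actually serving data, I assume the resource metrics API can be queried directly (standard commands, nothing specific to my cluster):

kubectl top pods
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods" | head

If those return per-pod CPU and memory numbers, the server itself seems healthy and the problem would be in how utilization is computed for my API pods.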

Here is the configuration of that metrics server:

Name:                   metrics-server-v0.3.6
Namespace:              kube-system
CreationTimestamp:      Sun, 21 Feb 2021 11:20:55 -0800
Labels:                 addonmanager.kubernetes.io/mode=Reconcile
                        k8s-app=metrics-server
                        kubernetes.io/cluster-service=true
                        version=v0.3.6
Annotations:            deployment.kubernetes.io/revision: 14
Selector:               k8s-app=metrics-server,version=v0.3.6
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           k8s-app=metrics-server
                    version=v0.3.6
  Annotations:      seccomp.security.alpha.kubernetes.io/pod: docker/default
  Service Account:  metrics-server
  Containers:
   metrics-server:
    Image:      k8s.gcr.io/metrics-server-amd64:v0.3.6
    Port:       443/TCP
    Host Port:  0/TCP
    Command:
      /metrics-server
      --metric-resolution=30s
      --kubelet-port=10255
      --deprecated-kubelet-completely-insecure=true
      --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
    Limits:
      cpu:     48m
      memory:  95Mi
    Requests:
      cpu:        48m
      memory:     95Mi
    Environment:  <none>
    Mounts:       <none>
   metrics-server-nanny:
    Image:      gke.gcr.io/addon-resizer:1.8.10-gke.0
    Port:       <none>
    Host Port:  <none>
    Command:
      /pod_nanny
      --config-dir=/etc/config
      --cpu=40m
      --extra-cpu=0.5m
      --memory=35Mi
      --extra-memory=4Mi
      --threshold=5
      --deployment=metrics-server-v0.3.6
      --container=metrics-server
      --poll-period=300000
      --estimator=exponential
      --scale-down-delay=24h
      --minClusterSize=5
      --use-metrics=true
    Limits:
      cpu:     100m
      memory:  300Mi
    Requests:
      cpu:     5m
      memory:  50Mi
    Environment:
      MY_POD_NAME:        (v1:metadata.name)
      MY_POD_NAMESPACE:   (v1:metadata.namespace)
    Mounts:
      /etc/config from metrics-server-config-volume (rw)
  Volumes:
   metrics-server-config-volume:
    Type:               ConfigMap (a volume populated by a ConfigMap)
    Name:               metrics-server-config
    Optional:           false
  Priority Class Name:  system-cluster-critical
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   metrics-server-v0.3.6-787886f769 (1/1 replicas created)
Events:
  Type    Reason             Age                    From                   Message
  ----    ------             ----                   ----                   -------
  Normal  ScalingReplicaSet  3m10s (x2 over 5m39s)  deployment-controller  Scaled up replica set metrics-server-v0.3.6-7c9d64c44 to 1
  Normal  ScalingReplicaSet  2m54s (x2 over 5m23s)  deployment-controller  Scaled down replica set metrics-server-v0.3.6-787886f769 to 0
  Normal  ScalingReplicaSet  2m50s (x2 over 4m49s)  deployment-controller  Scaled up replica set metrics-server-v0.3.6-787886f769 to 1
  Normal  ScalingReplicaSet  2m33s (x2 over 4m34s)  deployment-controller  Scaled down replica set metrics-server-v0.3.6-7c9d64c44 to 0

Edit: 2021-03-13

This is the configuration for the api server deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server-deployment
spec:
  replicas: 12
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      serviceAccountName: api-kubernetes-service-account
      nodeSelector:
        #<labelname>:value
        cloud.google.com/gke-nodepool: api-nodepool
      containers:
      - name: api-server
        image: gcr.io/questions-279902/taskserver:latest
        imagePullPolicy: "Always"
        ports: 
        - containerPort: 80
        #- containerPort: 443
        args:
        - --disable_https
        - --db_ip_address=127.0.0.1
        - --modelserver_address=http://10.128.0.18:8501 # kubectl get service model-service --output yaml
        resources:
          # You must specify requests for CPU to autoscale
          # based on CPU utilization
          requests:
            cpu: "250m"
      - name: cloud-sql-proxy
...

1 Answer


I don't see resources: fields (CPU, memory, etc.) set on all of the containers, and this is most likely the root cause. Having a CPU resource request on every container of the target Deployment is a requirement for the Horizontal Pod Autoscaler, as explained in the official Kubernetes documentation:

Please note that if some of the Pod's containers do not have the relevant resource request set, CPU utilization for the Pod will not be defined and the autoscaler will not take any action for that metric.

This can also cause the "Unable to read all metrics" warning on the target Deployment.
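As a minimal sketch of the fix, assuming the cloud-sql-proxy sidecar is the container missing a request (the 100m / 64Mi values below are placeholders, not recommendations), every container in the Pod template should carry a CPU request:

      containers:
      - name: api-server
        # ... existing settings ...
        resources:
          requests:
            cpu: "250m"
      - name: cloud-sql-proxy
        # ... existing settings ...
        resources:
          requests:
            cpu: "100m"    # placeholder; size to the sidecar's actual usage
            memory: "64Mi" # placeholder

Once every container has a CPU request, kubectl get hpa should report a utilization percentage instead of <unknown>, and the HPA will be able to scale the Deployment back down toward minReplicas.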