I have a GKE cluster with node auto-provisioning enabled, created like this:
gcloud beta container clusters create "some-name" --zone "us-central1-a" \
--no-enable-basic-auth --cluster-version "1.13.11-gke.14" \
--machine-type "n1-standard-1" --image-type "COS" \
--disk-type "pd-standard" --disk-size "100" \
--metadata disable-legacy-endpoints=true \
--scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
--num-nodes "1" --enable-stackdriver-kubernetes --enable-ip-alias \
--network "projects/default-project/global/networks/default" \
--subnetwork "projects/default-project/regions/us-central1/subnetworks/default" \
--default-max-pods-per-node "110" \
--enable-autoscaling --min-nodes "0" --max-nodes "8" \
--addons HorizontalPodAutoscaling,KubernetesDashboard \
--enable-autoupgrade --enable-autorepair \
--enable-autoprovisioning --min-cpu 1 --max-cpu 40 --min-memory 1 --max-memory 64
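(As a sanity check, the auto-provisioning limits should be visible on the cluster itself; I believe the relevant block is called autoscaling in the describe output, though the exact field name may vary by gcloud version.)

$ gcloud container clusters describe some-name --zone us-central1-a \
    --format "yaml(autoscaling)"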
I then ran a deployment that wouldn't fit on the existing node (which has only 1 vCPU):
kubectl run say-lol --image ubuntu:18.04 --requests cpu=4 -- bash -c 'echo lolol && sleep 30'
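A 4-CPU request can't fit on an n1-standard-1, so the pod had to wait for a bigger node. (If it's useful, the old run generator labels the pods with run=say-lol, so where the pod landed should be visible with something like:)

$ kubectl get pods -o wide -l run=say-lol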
The auto-provisioner correctly detected that a new node pool was needed, created one (an n1-highcpu-8 pool), and the deployment ran on it. However, the node pool was never deleted after it was no longer needed.
kubectl delete deployment say-lol
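(In case it's relevant: as far as I know GKE exposes the same cluster-autoscaler-status configmap as the open-source cluster autoscaler, so the scale-down state could presumably be inspected with something like:)

$ kubectl -n kube-system describe configmap cluster-autoscaler-status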
After all pods were gone, the auto-provisioned node has been sitting idle for more than 20 hours.
$ kubectl get nodes
NAME                                            STATUS   ROLES    AGE   VERSION
gke-some-name-default-pool-5003d6ff-pd1p        Ready    <none>   21h   v1.13.11-gke.14
gke-some-name-nap-n1-highcpu-8--585d94be-vbxw   Ready    <none>   21h   v1.13.11-gke.14
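(As far as I can tell only kube-system pods remain on the auto-provisioned node; listing them with a field selector should confirm that:)

$ kubectl get pods --all-namespaces -o wide \
    --field-selector spec.nodeName=gke-some-name-nap-n1-highcpu-8--585d94be-vbxw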
$ kubectl get deployments
No resources found in default namespace.
$ kubectl get events
No resources found in default namespace.
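(I assume any scale-down decision or blocker would be reported as an event outside the default namespace, so something like this might show more:)

$ kubectl get events --all-namespaces | grep -i scale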
Why isn't it cleaning up the expensive node pool?