0 votes

I have a GKE cluster with n nodes (2 nodes with 1 CPU each in this example) running a stress test. I want it to scale without stopping the pods that are already running.

The cluster has autoscaling enabled and a node pool with autoscaling enabled.

After reaching about 50 pods, memory and CPU run out and the cluster starts creating a new node in a different node pool. Why not in the current one?

After launching the new node, the cluster crashes completely:

  • no node is running;

  • some nodes are unschedulable with these warnings:

    "Cannot schedule pods: Insufficient memory."

    "Cannot schedule pods: node(s) had taints that the pod didn't tolerate." (I didn't set any taint though)

  • others are in Pending state.

What I want to achieve:

  • keep the existing pods running without crashing;
  • have the new pods created and kept in Pending state until the new node is created (see the sketch after this list);
  • have the new node created in the node pool with the instance template I have chosen.
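
As far as I understand, the second point only works if the pods declare resource requests the scheduler can account for; otherwise they get packed onto the already full nodes instead of staying Pending. A minimal sketch of what I mean (the deployment name stress and the request values are placeholders, not my actual manifest):

    # Give the stress pods explicit requests so that, when no node has room left,
    # the scheduler leaves new pods Pending and the autoscaler can react to that.
    kubectl set resources deployment stress --requests=cpu=100m,memory=128Mi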
Can you show the YAML file? – MikiBelavista

2 Answers

1 vote

It sounds like the behavior you are seeing is part of the node auto-provisioning feature of the cluster autoscaler. It automatically manages a set of node pools on your behalf, which is why you are seeing a new node pool being created.

If you just want the existing node pool to scale up and down, disable node auto-provisioning and set the autoscaling parameters of your node pool (min/max number of nodes) so that the autoscaler adds new nodes with the existing instance template.
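
Something along these lines should do it with gcloud (a sketch; the cluster name my-cluster, node pool my-pool, and zone are placeholders for your own values):

    # Stop the autoscaler from creating extra node pools on your behalf.
    gcloud container clusters update my-cluster \
        --zone us-central1-a \
        --no-enable-autoprovisioning

    # Let the existing node pool scale between the bounds you choose,
    # keeping its current instance template.
    gcloud container clusters update my-cluster \
        --zone us-central1-a \
        --node-pool my-pool \
        --enable-autoscaling --min-nodes 2 --max-nodes 10

Note that the autoscaler only adds a node when there are Pending pods it cannot place, so make sure the max is high enough for your stress test.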

0 votes

I disabled auto-provisioning and now the pool is scaling. But it still crashes all the pods while scaling.