
Imagine this hypothetical situation (that just bit me in practice):

  1. All worker instances in a Kubernetes cluster die (say, due to a spot price fluctuation), and a new one comes back automatically.
  2. The scheduler then attempts to schedule pods onto the node in some arbitrary order, but they can't all fit because there are fewer nodes than before.
  3. All the default namespace pods get scheduled, but the kube-system namespace DNS pod doesn't.
  4. Now almost everything trying to run on the cluster hangs because there's no DNS.

Is there any way to use the QoS tiers in Kubernetes to get the scheduler to prioritize scheduling the kube-system pods ahead of pods in other namespaces? Or is there some other way I should be fixing this problem?


1 Answer


This is a real problem, and Kubernetes doesn't have Pod QoS guarantees yet.

To be completely safe, your cluster should be big enough to handle any expected cluster shrinkage, but that's not always practical.
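As a rough way to reason about "big enough", here is a minimal sketch in Python that checks whether the summed CPU requests of your pods would still fit after losing some nodes. The function name, the millicore figures, and the node counts are all made up for illustration. It only compares totals, so it ignores per-node bin-packing, memory, and resources reserved for system daemons; a passing check is necessary but not sufficient.

```python
def fits_after_shrink(node_cpu_millicores, nodes_remaining, pod_requests_millicores):
    """Return True if the total CPU requested by all pods is no larger than
    the total CPU of the remaining nodes.

    This is a coarse upper-bound check (assumption: identical nodes, CPU only).
    """
    capacity = node_cpu_millicores * nodes_remaining
    return sum(pod_requests_millicores) <= capacity


if __name__ == "__main__":
    # Hypothetical workload: ten pods requesting 500m CPU each (5000m total).
    requests = [500] * 10
    print(fits_after_shrink(2000, 3, requests))  # three 2000m nodes: 6000m capacity
    print(fits_after_shrink(2000, 2, requests))  # two 2000m nodes: 4000m capacity
```

In this made-up example the workload fits on three nodes but not on two, so losing a single node would already leave some pods unschedulable.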

At the moment, manually shrinking the competing, lower-priority deployments would probably be the easiest way to get the cluster working again.
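One way to do that shrinking is with `kubectl scale`. The sketch below, in Python, only builds the commands and prints them so you can review them before running anything. The deployment names are hypothetical, and it assumes your workloads are Deployments in the default namespace; adjust the resource type and namespace to match your cluster.

```python
import shlex


def scale_down_commands(deployments, namespace="default", replicas=0):
    """Build one `kubectl scale` command (as an argument list) per deployment."""
    return [
        ["kubectl", "scale", "deployment", name,
         f"--replicas={replicas}", f"--namespace={namespace}"]
        for name in deployments
    ]


if __name__ == "__main__":
    # Hypothetical lower-priority deployments; substitute your own names.
    for cmd in scale_down_commands(["web-frontend", "batch-worker"]):
        # Print the command for review instead of executing it.
        print(shlex.join(cmd))
```

Once the kube-system DNS pod has been scheduled, you can scale the deployments back up the same way with their original replica counts.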

Work is under way to add Pod QoS policies to Kubernetes. You can follow along or chime in at https://github.com/kubernetes/kubernetes/pull/14943