3 votes

After trying various providers (bare-metal Kubernetes, OpenShift, AWS EKS), we have found that even when a node has enough resources (CPU, RAM, disk), once it reaches ~110 pods any new pods hang in the Pending state without any events or errors, except for this one event:

"Successfully assigned {namespace}/{pod_name} to {node_name}"

We have tried to search for related logs in the kubelet, the scheduler, etc., but there is nothing apart from the event mentioned above.
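For reference, this is roughly how we have been checking (a minimal sketch; the pod, namespace and node names are placeholders, and the journalctl line assumes a systemd-managed kubelet):

    # Confirm the pod is scheduled but stuck, and list its events
    kubectl describe pod <pod_name> -n <namespace>

    # Compare how many pods the node advertises vs. how many it actually runs
    kubectl get node <node_name> -o jsonpath='{.status.allocatable.pods}'
    kubectl get pods --all-namespaces --field-selector spec.nodeName=<node_name> --no-headers | wc -l

    # Kubelet logs on the node itself
    journalctl -u kubelet --since "30 min ago"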

Has anyone succeeded in running more than 110 pods per node? What are we doing wrong?

The only other thing worth mentioning is that in our case it is not 110 replicas of the same pod, but ~110 different pods from various Deployments/DaemonSets. And of course we have raised the node pod limit (kubelet max-pods) above 110.
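For context, raising that limit means setting maxPods in the kubelet configuration and restarting the kubelet, roughly like this (a sketch only; the file path, the value, and the exact mechanism differ per provider, and managed services expose it through their own node bootstrap options):

    # /var/lib/kubelet/config.yaml (path varies by distribution)
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    maxPods: 250        # default is 110
    # ...the rest of the existing kubelet settings stay unchanged...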


2 Answers

2 votes

While the current scaling target is 500 pods/node (see https://github.com/kubernetes/community/blob/master/sig-scalability/goals.md), depending on how many total nodes you are talking about, you might be in territory where the default scheduler settings are no longer helpful. Unfortunately, scheduler tuning is a bit of a dark art; I would recommend asking for help in the sig-scaling Slack channel.
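As an illustration only (a sketch, not recommended values): scheduler behaviour is tuned through a KubeSchedulerConfiguration file passed to kube-scheduler via --config, where fields such as percentageOfNodesToScore trade placement quality for scheduling throughput on large clusters. The exact apiVersion depends on your Kubernetes version.

    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    # Score only a fraction of the feasible nodes for each pod; this lowers
    # scheduling latency on big clusters at the cost of less optimal placement.
    percentageOfNodesToScore: 20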

2 votes

Kubernetes supports 110 pods per node by default. There have been requests from multiple channels to increase the pods-per-node limit.
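You can check what limit each node currently advertises with something like this (a quick sketch using kubectl custom columns):

    kubectl get nodes -o custom-columns=NAME:.metadata.name,MAXPODS:.status.capacity.pods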

There is a PR raised to support 500 pods per node; it is still open, though. You can track its status at the link below:

PR to support 500 pods per node