0
votes

A new Kubernetes deployment was created automatically with 1000 replicas. The same issue occurred a few months ago, and at that time I deleted the kubernetes-cli deployment and its pods. Now, a few months later, I am facing the issue again. How can I prevent this? Can anyone help?

Created deployment: [screenshot]

I monitored Kubernetes events with the "kubectl get events" command. The relevant events are below: [screenshot]

CPU usage: [screenshot]

PID information: [screenshot]

Created pods: [screenshot]

Pod information: [screenshot]

2
Does `kubectl get events` give any useful info? – Nicola Ben
Have you checked the resource usage, such as CPU and memory? – Charles Xu
@NicolaBen, I attached the relevant Kubernetes events to the description, please check. – Sudharshan
@CharlesXu-MSFT, I attached the resource usage to the description, please check. – Sudharshan
@Sudharshan I mean you can check the node with the command `kubectl describe node nodeName` if you have just one node. I see your other pods are in the Pending state, so maybe the node's resources are exhausted. – Charles Xu

2 Answers

0
votes

As stated in the following link:

At v1.11, Kubernetes supports clusters with up to 5000 nodes. More specifically, we support configurations that meet all of the following criteria:

No more than 5000 nodes
No more than 150000 total pods
No more than 300000 total containers
No more than 100 pods per node

Your event log shows an "Insufficient pods" error.

If your cluster doesn't have more than 10 working nodes, the scheduler cannot place 1000 replicas (plus the pods already running) while keeping every node at or below 100 pods, which is the fourth limit above.
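To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the 100-pods-per-node limit quoted above; the function names and the count of 20 already-running pods are hypothetical and only serve the illustration.

```python
def min_nodes_needed(total_pods: int, max_pods_per_node: int = 100) -> int:
    """Smallest number of nodes that can hold total_pods under the per-node cap."""
    # Integer ceiling division, avoids floating-point rounding.
    return -(-total_pods // max_pods_per_node)


def fits(total_pods: int, nodes: int, max_pods_per_node: int = 100) -> bool:
    """True if total_pods can be spread over nodes without exceeding the cap."""
    return total_pods <= nodes * max_pods_per_node


replicas = 1000      # replica count from the question
existing_pods = 20   # hypothetical: system and application pods already running

print(min_nodes_needed(replicas))                  # nodes for the replicas alone
print(min_nodes_needed(replicas + existing_pods))  # nodes once other pods are counted
print(fits(replicas + existing_pods, nodes=10))    # can 10 nodes hold everything?
```

With these assumed numbers, the replicas alone would exactly fill 10 nodes, so any other running pod pushes the requirement to 11 nodes and the remaining replicas stay Pending.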

The same limits apply to Kubernetes versions before v1.11.

0
votes

This doesn't seem to be the case in your particular deployment; however, something similar happened to me too, so I'll post it in case someone Googles this question.

I had a failing job without a Delete policy, and this made it deploy itself 1000 times for some reason. Specifying a delete policy fixed the issue.