2 votes

A similar question on SO has 10 answers, all amounting to 'force delete the pod' -_-

Of course force-deleting is unacceptable, as it causes problems on the cluster: too many pods are stuck in 'Terminating', and often when you then try to delete another pod it gets stuck too. It happens fairly randomly.

So how do I determine, first, why the termination commands are being issued at all, and second, what the culprit behind the freezes is?

Is it the CNI? Core components like the kubelet or the controller-manager?

Logs don't show anything useful, nor does 'describe pod'.
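
For reference, the checks that came up empty were along these lines (pod and namespace names here are placeholders):

    # Events, conditions and finalizers of a stuck pod
    kubectl describe pod <pod-name> -n <namespace>

    # Container logs, including the previous instance if it was restarted
    kubectl logs <pod-name> -n <namespace> --previous

    # Recent events in the namespace, sorted by time
    kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp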


1 Answer

1 vote

If your pods get terminated with no apparent cause, it could be that:

  • the node is under resource pressure (memory, CPU)
  • a liveness probe is failing

In both cases it is the kubelet on the node, not the scheduler, that evicts the pod or restarts its containers; the checks below can help tell which one is happening.
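
A couple of quick checks usually separate the two cases (standard kubectl/journalctl commands; node names are placeholders):

    # Node conditions: look for MemoryPressure / DiskPressure = True and eviction events
    kubectl describe node <node-name>

    # Events mentioning evictions, OOM kills or failed liveness probes
    kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp | grep -Ei 'evict|oom|unhealthy|liveness'

    # High restart counts usually point at a failing liveness probe
    kubectl get pods --all-namespaces

    # On the node itself, the kubelet log records both evictions and probe failures
    journalctl -u kubelet --since '1 hour ago'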

How do you determine the precise cause? If you found the 'logs' and 'describe' commands useless, a monitoring system can help (e.g. InfluxDB + Grafana: https://github.com/kubernetes/heapster/tree/master/deploy/kube-config/influxdb).
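
A minimal way to stand that stack up, assuming you clone the linked heapster repo (paths are as laid out in that repo):

    # Deploy Heapster with the InfluxDB sink and the Grafana dashboards
    git clone https://github.com/kubernetes/heapster.git
    cd heapster
    kubectl create -f deploy/kube-config/influxdb/

Grafana then runs as a service in the kube-system namespace, and the per-node and per-pod memory/CPU graphs make pressure-related evictions much easier to spot than events alone.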