Pods stuck in ContainerCreating with “failed to assign an IP address to container”

Question

Multiple pods of a 600-pod deployment are stuck in ContainerCreating after a rolling update, with the message:

Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod network: add cmd: failed to assign an IP address to container

What I have tried:

Upgraded to v1.12 on EKS and CNI 1.5.0. This issue was closed with a note that CNI 1.5.0 fixed the problem, but it did not for us. Another thread blamed leaking ENIs, but it was also closed because of the CNI upgrade.
Installed cni-metrics-helper and this is a snapshot of the output:

maxIPAddresses, value: 759.000000
ipamdActionInProgress, value: 1.000000
addReqCount, value: 16093.000000
awsAPILatency, value: 564.000000
delReqCount, value: 32337.000000
eniMaxAvailable, value: 69.000000
assignIPAddresses, value: 558.000000
totalIPAddresses, value: 682.000000
eniAllocated, value: 69.000000

Does the CNI metrics output suggest there's an issue? It seems like there are enough IPs.

What else can I try to debug?

The only thing I see is that all ENIs have been allocated and none are left, though IPs are still available. — Tarun Lalwani

Jakub Bujny · Accepted Answer · 2019-08-02T10:33:38

It seems that you have reached the maximum number of IP addresses in your subnet. The documentation hints at this:

maxIPAddress: the maximum number of IP addresses that can be used for Pods in the cluster. (assumes there is enough IPs in the subnet).

Please also take a look at the maxUnavailable and maxSurge parameters, which control how many pods exist during a rolling update. Your configuration may allow more than 600 pods at once during the update (for example 130%), and that could hit the limits of your AWS network.
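To make the surge arithmetic concrete, here is a rough sketch in Python. The question does not show the Deployment's strategy, so the 25% maxSurge used here is an assumption (it is the Kubernetes default when nothing is set), and peak_pods is a hypothetical helper, not part of any tool. The metric values are copied from the cni-metrics-helper snapshot in the question.

```python
from typing import Union


def peak_pods(replicas: int, max_surge: Union[int, str]) -> int:
    """Upper bound on pods of one Deployment that exist at once during a rolling update.

    max_surge is either an absolute count or a percentage string such as "25%".
    Kubernetes rounds a percentage maxSurge up to a whole pod.
    """
    if isinstance(max_surge, str) and max_surge.endswith("%"):
        percent = int(max_surge[:-1])
        surge = -(-replicas * percent // 100)  # integer ceiling division
    else:
        surge = int(max_surge)
    return replicas + surge


# Values from the cni-metrics-helper snapshot in the question.
max_ip_addresses = 759
total_ip_addresses = 682
assigned_ip_addresses = 558

# ASSUMPTION: maxSurge is left at the Kubernetes default of 25%.
peak = peak_pods(600, "25%")

print("peak pods during rollout:", peak)
print("headroom against maxIPAddresses:", max_ip_addresses - peak)
print("headroom against totalIPAddresses:", total_ip_addresses - peak)
print("currently unassigned IPs:", total_ip_addresses - assigned_ip_addresses)
```

Under that assumption the rollout can briefly need 750 pod IPs for this one Deployment, which is above the 682 addresses currently attached and only just under the 759 ceiling, before counting any other pods in the cluster. If your numbers look similar, lowering maxSurge or adding node capacity is worth testing.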
