nginx-ingress failing intermittely

Question

NGINX Ingress controller version: 0.22.0

Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.22.0

Image ID: docker-pullable://quay.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:47ef793dc8dfcbf73c9dee4abfb87afa3aa8554c35461635f6539c6cc5073b2c

quay.io/kubernetes-ingress-controller/nginx-ingress-controller

Kubernetes version (use kubectl version): v1.15.3

Environment:

Cloud provider or hardware configuration: Vm's in Vcenter

OS (e.g. from /etc/os-release): VERSION="16.04.6 LTS (Xenial Xerus)"

Kernel (e.g. uname -a): Linux appsec-ana01 4.4.0-143-generic #169-Ubuntu SMP Thu Feb 7 07:56:38 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Install tools:

Others:

What happened: my nginx-ingress-controller is failing and restarting inconsistently..

What you expected to happen: No Restart to happen

How to reproduce it (as minimally and precisely as possible): Same Version issue should occur.

curl -I http://10.244.10.48:10254/healthz
HTTP/1.1 200 OK
Date: Wed, 11 Sep 2019 11:15:56 GMT
Content-Length: 2
Content-Type: text/plain; charset=utf-8

Anything else we need to know:

kubectl get events results in following

2m Warning Unhealthy pod/nginx-ingress-controller-7cfb747d6c-4n4nz Liveness probe failed: Get http://10.244.10.48:10254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
30s Warning Unhealthy pod/nginx-ingress-controller-7cfb747d6c-4n4nz Liveness probe failed: Get http://10.244.10.48:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
6m5s Warning Unhealthy pod/nginx-ingress-controller-7cfb747d6c-4n4nz Readiness probe failed: Get http://10.244.10.48:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
35m Warning Unhealthy pod/nginx-ingress-controller-7cfb747d6c-4n4nz Readiness probe failed: Get http://10.244.10.48:10254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

curl -I http://10.244.10.48:10254/healthz
HTTP/1.1 200 OK
Date: Wed, 11 Sep 2019 11:15:56 GMT
Content-Length: 2
Content-Type: text/plain; charset=utf-8

This question can't be read. Please format your post: stackoverflow.com/help/formatting — Amit Kumar Gupta

Nick_Kh Nick_Kh · Accepted Answer · 2019-09-12T10:51:16

The errors in you logs clearly say about health check issues with a particular nginx-ingress-controller* Pod. Due to the fact that Readiness and Liveness probes are fully managed by kubelet node agent, I would check carefully kubelet service, in order to fetch any related errors or suspicious events:

$ sudo systemctl status kubelet -l

$ sudo journalctl -u kubelet

Therefore, the issue may occurs with a connectivity from kubelet node to K8s API server, despite your successful efforts testing curl requests targeting /health endpoint.

Meanwhile, you can check logs from Nginx Ingress controller's Pod and inspect bootstrap events:

kubectl logs $(kubectl get po -l app=nginx-ingress -o jsonpath='{.items[0].metadata.name}')

I would also check the K8s Node capacity where origin nginx-ingress-controller* Pod resides on and observe overall resource utilization for allocated objects across the cluster.

I recommend you to review the official Nginx Ingress-Controller troubleshooting document to get more ways for investigation.

nginx-ingress failing intermittely

2 Answers