0
votes

I keep getting this error when I try to set up liveness and readiness probes for my awx_web container:

Liveness probe failed: Get http://POD_IP:8052/: dial tcp POD_IP:8052: connect: connection refused

Liveness and readiness section of my deployment for the awx_web container:

          ports:
          - name: http
            containerPort: 8052 # the port of the container awx_web
            protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: 8052
            initialDelaySeconds: 5
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 8052
            initialDelaySeconds: 5
            periodSeconds: 5

If I test whether port 8052 is open, either from another pod in the same namespace as the pod containing the awx_web container or from a container deployed in the same pod as awx_web, I get this (the port is open):

/ # nc -vz POD_IP 8052
POD_IP  (POD_IP :8052) open

I get the same result (port 8052 is open) if I use netcat (nc) from the worker node where the pod containing the awx_web container is deployed.

For info, I use a NodePort service that redirects traffic to that container (awx_web):

type: NodePort
ports:
- name: http
  port: 80
  targetPort: 8052
  nodePort: 30100
2
If you do curl http://POD_IP:8052/ from another pod, does it work? - Arghya Sadhu
From another pod, a container in the same pod, or the worker node, yes, it works. - Abderrahmane
Check the kubelet and CNI plugin pod logs. - Arghya Sadhu
The kubelet log gave the same error. - Abderrahmane

2 Answers

2
votes

I recreated your issue, and it looks like your problem is caused by too small a value of initialDelaySeconds for the liveness probe.

It takes more than 5s for the awx container to open port 8052, so you need to wait a bit longer for it to start. I found that setting initialDelaySeconds to 15s was enough for me, but you may need to tweak it for your environment.
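
As a rough sketch, the probe section from the question with the longer delay might look like the following. The 15s value is only what worked in my test, and applying the same delay to the readiness probe is my assumption; the rest is copied from the question.

```yaml
# Sketch only: tune initialDelaySeconds to how long awx_web takes to start in your cluster.
livenessProbe:
  httpGet:
    path: /
    port: 8052
  initialDelaySeconds: 15   # was 5; gives the container time to open port 8052
  periodSeconds: 5
readinessProbe:
  httpGet:
    path: /
    port: 8052
  initialDelaySeconds: 15   # assumption: same delay applied here for consistency
  periodSeconds: 5
```

If the container still gets restarted with this value, increase the delay further rather than removing the probe.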

0
votes

Most likely your application couldn't start up, or it crashes shortly after starting. This may be due to insufficient memory or CPU resources, or because one of the AWX dependencies, such as PostgreSQL or RabbitMQ, is not set up correctly.

Did you check whether your application works correctly without the probes? I recommend doing that first. Also watch the pod's status for a while to make sure it is not restarting.