2
votes

I'm having some trouble installing an Ingress Controller in my on-prem cluster (created with Kubespray, running MetalLB to provide LoadBalancer services).

I tried using nginx, traefik, and kong, but all gave the same result.

I'm installing the nginx helm chart using the following values.yaml:

controller:
  kind: DaemonSet
  nodeSelector:
    node-role.kubernetes.io/master: ""
  image:
    tag: 0.23.0
rbac:
  create: true

With command:

helm install --name nginx stable/nginx-ingress --values values.yaml --namespace ingress-nginx

When I deploy the ingress controller in the cluster, a service is created (e.g. nginx-ingress-controller for nginx). This service is of type LoadBalancer and gets an external IP.

When this external IP is assigned, the node that the IP is linked to is lost (status NotReady). However, when I check this node, it's still running; it's just cut off from the other nodes and can't even ping them ("No route found"). When I remove the service (but not the rest of the nginx helm chart), everything works again and the Ingress still works. I also tried installing nginx/traefik/kong without a LoadBalancer, using NodePorts or external IPs on the service, but I get the same result.

Does anyone recognize this behaviour? Why does the ingress still work, even when I remove the nginx-ingress-controller service?

1
Can you please elaborate on this: "node that's linked to this external IP is lost". Do you mean that the node and the ingress service are attempting to claim the same public IP? – A_Suh
Hi @A_Suh, thanks for your response! The external IP for the service is the IP of one of the 5 nodes in my cluster. Let's call that node X. When the service is created and gets an external IP, X gets status NotReady. However, X is not down, since I can still log in to it and kubelet is still running. The moment the service is installed in my cluster, X can't access the master node anymore, so its health pings can't reach master. When I ping the master node (or any other node) from X, I get "Destination Host Unreachable". – Nils Lamot
This is weird: your DHCP server is assigning an IP to the service that has already been assigned to the node. Would you try manually setting a static IP address on your ingress service? i.e. apiVersion: v1 kind: Service spec: type: LoadBalancer loadBalancerIP: 10.10.10.10 – A_Suh
Oh, you were right, for MetalLB you have to provide the IP addresses yourself. Did you manage to solve the issue? – A_Suh
Hi @A_Suh, I just managed to solve this issue. It turns out you were right, and I did need to specify IPs outside of the cluster to fix this. The reason this didn't work at first was that DHCP wasn't configured in my network. Thanks for your help! – Nils Lamot

1 Answer

4
votes

After a long search, we finally found a working solution for this problem.

As mentioned by @A_Suh, the pool of IPs that MetalLB uses should contain IPs that are not currently assigned to any node in the cluster. By adding a new IP range that's also configured in the DHCP server (so it won't be handed out to other machines), MetalLB can use ARP to link one of those IPs to one of the nodes.
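For reference, here is a minimal sketch of what such a pool could look like, assuming MetalLB's ConfigMap-based configuration in the metallb-system namespace and layer 2 mode (the address range matches the example below; adjust names and range to your setup):

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 10.4.5.200/31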

For example, in my 5-node cluster (kube11-15): when MetalLB gets the range 10.4.5.200/31 and allocates 10.4.5.200 for my nginx-ingress-controller, 10.4.5.200 is linked to kube12. ARP requests for 10.4.5.200 are answered with kube12's MAC address, and traffic is routed to this node.
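As a sketch of how one could verify which node is answering (eth0 is a placeholder for whatever interface sits on that L2 segment), send an ARP probe from another machine on the same network and compare the reply with kube12's MAC address:

# ARP-probe the service IP; the reply should carry kube12's MAC
arping -I eth0 -c 3 10.4.5.200

# check the kernel's neighbour table for the resolved entry
ip neigh show 10.4.5.200

# on kube12, confirm its MAC address matches
ip link show eth0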