I'm having troubles with my Kubernetes ingress on GKE. I'm simulating termination of a preemptible instance by manually deleting it (through the GCP dashboard). I am running a regional GKE cluster (one VM in each avaibility zone in us-west1).
A few seconds after selecting delete on only one of the VMs I start receiving 502 errors through the load balancer. Stackdriver logs for the load balancer list the error as failed_to_connect_to_backend.
Monitoring the health of backend-service shows the backend being terminated go from HEALTHY to UNHEALTHY and then disappearing while the other two backends remain HEALTHY.
After a few seconds requests begin to succeed again.
I'm quite confused why the load balancer is unable to direct traffic to the healthy nodes while one goes down - or maybe this is a kubernetes issue? Could the load balancer be properly routing traffic to a healthy instance, but the kubernetes NodePort service on that instance proxies the request back to the unhealthy instance for some reason?