
My development environment worked well throughout 2019, but at the start of 2020 one of my services stopped working. Looking into the details, the cause of the failure is that the health check reports the VM as unhealthy.

The setup is simple: my WCF service runs on IIS and I use a TCP load balancer. The same setup is replicated so the service can be reached through two different IPs, but only one of the health checks fails. The service is available from localhost and from the ephemeral IP of the instance, but it fails when I access it through the static IP that I assigned to the load balancer.

Since the same application runs on both VMs and all the configuration is similar, I want to find out what makes the health check fail. Checking the connections to the enabled port with Wireshark, I find that all connections from the load balancer are marked as TCP Retransmission, while the connections to the ephemeral IP arrive fine and receive a 200 OK.

My knowledge of this topic is minimal and I cannot identify any information that would help solve the problem, especially since everything was working well last year.

I want to see what information Stackdriver can give me, but for some reason I can't find the TCP load balancers in the list.

TCP Load Balancer

stackdriver

EDIT: Apparently the problem was due to the fact that the "Google Compute Engine Agent" service was not running, but I couldn't see any log indicating what caused the service to stop. When starting the service again everything worked normally.

1
What errors/warnings appear in Stackdriver? Stackdriver is a great resource for debugging. - John Hanley
In Stackdriver I do not see any failure; the VM appears without problems, but I cannot find the load balancer in the list. Do TCP load balancers not appear in Stackdriver? - Ricardo Alfonso Ortega Jaimes
Which load balancer are you using? Google has many. - John Hanley
I included the images in the post. - Ricardo Alfonso Ortega Jaimes

1 Answer


Please note that Google Cloud Platform provides health checking mechanisms that determine whether VM instances respond properly to traffic. Health checks and load balancers work together. GCP uses the overall health state of each VM to determine its readiness for receiving new requests from the LB.

Like other load balancers, the TCP LB requires a health check to verify instance health. To allow traffic from the load balancer and health checker to reach the instances, you need to configure firewall rules that allow the source IP ranges 130.211.0.0/22 and 35.191.0.0/16.
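
When reading a packet capture such as the Wireshark trace mentioned in the question, it can help to tell health check probes apart from ordinary client traffic. The following is a minimal sketch, assuming only the two source ranges quoted above; the function name `is_health_check_source` is made up for illustration and is not part of any Google tooling.

```python
import ipaddress

# Source ranges quoted in this answer for load balancer health checks.
HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def is_health_check_source(ip: str) -> bool:
    """Return True if the address falls inside one of the ranges above."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in HEALTH_CHECK_RANGES)
```

Calling `is_health_check_source()` on the source address of a retransmitted packet shows whether it came from a health checker. If it did and the packets never get a reply, the firewall rule for these ranges is one of the first things to verify.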

If accessing the service on your backend instances through the load balancer frontend IP does not return a 200 OK response and the instances show as unhealthy, the issue might be inside the guest. The probable causes are the following:

  1. The port configured on the backend is not the same as the port configured on the health check.

  2. The service is not running on the configured port at that time.

  3. The service is bound to a specific IP address rather than to all IP addresses, i.e. 0.0.0.0.
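
Causes 2 and 3 can be checked from inside the guest with a plain TCP connection attempt. This is a rough sketch using only the Python standard library; the helper names `port_accepts_connections` and `check_bindings` are hypothetical, and the port and addresses you pass in are whatever your own service and instance use.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection performs the full TCP handshake or raises OSError
        # (connection refused, timeout, unreachable network, ...).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_bindings(port: int, addresses: list[str]) -> dict[str, bool]:
    """Probe the same port on several local addresses and report each result."""
    return {addr: port_accepts_connections(addr, port) for addr in addresses}
```

For example, calling `check_bindings()` with the service port and a list containing `127.0.0.1` plus the instance's internal IP gives one result per address. If the loopback address succeeds but the internal IP fails, the service is probably bound to a specific address rather than to 0.0.0.0. If both fail, the service is likely not listening on that port at all.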

Please note that Stackdriver logging is not available for the global TCP proxy LB. To debug the 'unhealthy' instances issue, reviewing the access logs or system logs might also help.