Background
We have a dispatcher instance group that receives around 700 requests per second per active VM. This dispatcher is behind a load balancer that autoscales. So far all our VMs are regular VMs, but we have been studying the possibility of making them preemptible.
The problem with preemptible instances
According to the documentation, GCP can terminate a preemptible instance at any time.
Let's assume that each dispatcher VM holds no state. It receives a request, processes it and makes an HTTP request to some other machine.
At any given time, each VM will be processing around 700 requests concurrently, while receiving data from the load balancer.
Question
What happens if my preemptible VM, while processing 700 requests, receives a signal to be terminated?
Well, in theory one should have a shutdown script that makes sure the in-flight requests finish processing and then kills the app (clean exit). This leads us to the big question:
- But does the load balancer know that my VM is shutting down? Will it keep sending requests to the terminating VM?
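To make the "clean exit" idea concrete, here is a minimal sketch of the drain logic I have in mind, not a definitive implementation. The class and method names (DrainState, begin_request, and so on) are hypothetical, and it assumes the app exposes a health-check endpoint that the load balancer polls. On the termination signal the app starts failing its health check so the load balancer can notice, then waits for in-flight requests to finish within a time budget. It keeps accepting requests while draining, on the assumption that the load balancer needs a few health-check intervals to react.

```python
import threading
import time


class DrainState:
    """Hypothetical helper: tracks in-flight requests and shutdown state."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._draining = False

    def begin_request(self):
        # Called when a request arrives. Requests are still accepted while
        # draining, because the load balancer may not have noticed yet.
        with self._cond:
            self._in_flight += 1

    def end_request(self):
        # Called when a request has been fully processed.
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def start_draining(self):
        # Called from the shutdown path (e.g. a SIGTERM handler).
        with self._cond:
            self._draining = True

    def health_status(self):
        # HTTP status the health-check endpoint should return:
        # 503 tells the load balancer to stop routing traffic here.
        with self._cond:
            return 503 if self._draining else 200

    def wait_until_idle(self, timeout):
        # Block until no requests are in flight or the budget runs out.
        # Returns True if the app can exit cleanly, False on timeout.
        deadline = time.monotonic() + timeout
        with self._cond:
            while self._in_flight > 0:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                self._cond.wait(remaining)
            return True
```

The shutdown path would call start_draining() and then wait_until_idle() with a budget shorter than the preemption notice window (the exact budget is an assumption to tune). Whether the load balancer reacts to the failing health check quickly enough is exactly what the question above is asking.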
Considerations
If it keeps sending requests, then some of them will fail: once the app shuts down the machine is still up, and the load balancer keeps sending requests to it, not knowing the app is already down.
Ideally, these requests would go back to the load balancer as failed requests and it would send them to another machine. However, GCP load balancers are not smart enough to do this, so they don't.
If the load balancer somehow knows that this VM was selected for preemption, then nothing special needs to be done.
Which one is it?