1
votes

Background

We have a dispatcher instance group that receives around 700 requests per second per active VM. This dispatcher is behind a load balancer that autoscales. Thus far all our VMs are regular VMs; however, we have been studying the possibility of making them preemptible.

The problem with preemptible instances

According to the documentation, GCP can terminate a preemptible instance at any time.

Let's assume that each dispatcher VM holds no state. It receives a request, processes it and makes an HTTP request to some other machine.

At any given time, each VM will be processing around 700 requests concurrently, while receiving data from the load balancer.

Question

What happens if my preemptible VM, while processing 700 requests, receives a termination signal?

Well, in theory one should have a shutdown script that makes sure processing those requests finishes and then kills the app (clean exit). This leads us to the big question:

  • But does the load balancer know that my VM is shutting down? Will it keep sending requests to the terminating VM?

Considerations

If yes, then some requests will fail: once the app shuts down the machine is still up, and the load balancer keeps sending it requests, not knowing the app is already down.

Ideally, these requests would go back as failed requests to the load balancer and it would send the requests to another machine. However GCP load balancers are not smart enough to do this, and so they don't.

If the load balancer somehow knows this VM was selected for preemption, then nothing special needs to be done.

Which one is it?

2 Answers

2
votes

But does the load balancer know that my VM is shutting down? Will it keep sending requests to the terminating VM?

Yes, the load balancer will continue to send requests to the instance.

You will need to create a shutdown script and remove your instance from the load balancer.
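
Besides the shutdown script, the application itself can ask the Compute Engine metadata server whether the VM has been preempted. The following is a minimal sketch, assuming Java 11+ and the documented `instance/preempted` metadata path; the class and method names are hypothetical, and the step of actually removing the instance from the load balancer is not shown because it depends on how your backend is set up.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper: asks the metadata server whether this VM has been preempted. */
final class PreemptionWatcher {
    // Compute Engine metadata path; only resolvable from inside a GCE VM.
    static final String PREEMPTED_URL =
        "http://metadata.google.internal/computeMetadata/v1/instance/preempted";

    /** The endpoint answers with the literal text TRUE or FALSE. */
    static boolean parsePreempted(String body) {
        return body != null && "TRUE".equalsIgnoreCase(body.trim());
    }

    /** Builds the request; the metadata server rejects calls without this header. */
    static HttpRequest buildRequest() {
        return HttpRequest.newBuilder(URI.create(PREEMPTED_URL))
            .header("Metadata-Flavor", "Google")
            .timeout(Duration.ofSeconds(2))
            .GET()
            .build();
    }

    /** One poll. Throws if the metadata server is unreachable. */
    static boolean isPreempted(HttpClient client) throws Exception {
        HttpResponse<String> response =
            client.send(buildRequest(), HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 && parsePreempted(response.body());
    }
}
```

A background thread could call `PreemptionWatcher.isPreempted(HttpClient.newHttpClient())` every second or so and start draining as soon as it returns true.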

It is not that the load balancer is not smart enough. The load balancer does not know if your requests can be retried. That decision should be made by the client / backend logic.
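
Since the retry decision sits with the caller, one possible shape for it is a small wrapper that repeats only the requests the caller declares safe to repeat. This is a sketch under that assumption; `RetryingCaller` and its parameters are hypothetical names, and a real client would probably add a backoff between attempts.

```java
import java.util.concurrent.Callable;

/** Hypothetical caller-side helper: retries only work the caller declares safe to repeat. */
final class RetryingCaller {
    /**
     * Runs the request up to maxAttempts times if it is idempotent, exactly once otherwise.
     * Rethrows the last failure when every attempt fails.
     */
    static <T> T call(Callable<T> request, boolean idempotent, int maxAttempts) throws Exception {
        int attempts = idempotent ? Math.max(1, maxAttempts) : 1;
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return request.call();
            } catch (Exception e) {
                last = e; // e.g. connection reset because the backend VM went away
            }
        }
        throw last; // non-null here: the loop ran at least once and every attempt failed
    }
}
```

A read-only lookup would be passed with `idempotent = true`, while something like a payment submission would be passed with `false` so that a failure surfaces instead of being silently repeated.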

Your use case is not a good fit for preemptible instances. A preemptible instance is always terminated after running for at most 24 hours, and may be preempted sooner. If your goal is cost savings, compare the cost of long-term instance pricing with preemptible pricing. The savings are not enough to justify the engineering, testing and QA costs.

Architectures should be designed for failure, but I would not deliberately pick an architecture that will fail constantly, in your case at least every 24 hours. There is also the risk that you will not be able to launch another instance to make up for the increased load, and the risk that all your instances will be terminated.

1
votes

We had a similar problem and have almost solved it with load balancer health checks (there are still some issues under very high load). The trick is that within 10-15 seconds of the preempt signal, the load balancer marks the instance as unhealthy and stops sending new requests to it.

Solution:

  1. The load balancer checks the health of the instance every 3 seconds and marks it as unhealthy after the third failed health check. It therefore marks the instance unhealthy within about 10 seconds and stops sending it new requests.
  2. Trap the preempt signal in Java using ContextClosedEvent (Spring Boot) or Runtime.getRuntime().addShutdownHook(). (In my case it took a couple of seconds before the JVM received the signal.)
  3. Make the health checks fail, i.e. the health check endpoint starts returning 404.
  4. Sleep in the shutdown block for 15-25 seconds to let in-progress and new requests complete.
  5. Release resources and do shutdown logging.
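
The arithmetic behind steps 1 and 4 can be checked with a few lines. This is only a sketch: `DrainBudget` is a hypothetical helper, and the 30-second window is an assumption based on the roughly 30 seconds Compute Engine documents between the preemption notice and the forced stop.

```java
import java.time.Duration;

/** Hypothetical helper that checks the drain sleep against the health-check timing. */
final class DrainBudget {
    /** Time for the load balancer to see enough failures: interval x threshold. */
    static Duration detectionTime(Duration checkInterval, int unhealthyThreshold) {
        return checkInterval.multipliedBy(unhealthyThreshold);
    }

    /** True if the sleep outlasts detection and still fits in the shutdown window. */
    static boolean sleepFits(Duration sleep, Duration detection,
                             Duration signalDelay, Duration window) {
        boolean outlastsDetection = sleep.compareTo(detection) > 0;
        boolean fitsWindow = signalDelay.plus(sleep).compareTo(window) <= 0;
        return outlastsDetection && fitsWindow;
    }

    public static void main(String[] args) {
        Duration detection = detectionTime(Duration.ofSeconds(3), 3); // values from step 1
        Duration window = Duration.ofSeconds(30);      // assumed preemption notice period
        Duration signalDelay = Duration.ofSeconds(2);  // "a couple of seconds" from step 2
        System.out.println("detection: " + detection.getSeconds() + "s");
        System.out.println("20s sleep fits: "
            + sleepFits(Duration.ofSeconds(20), detection, signalDelay, window));
        System.out.println("30s sleep fits: "
            + sleepFits(Duration.ofSeconds(30), detection, signalDelay, window));
    }
}
```

With the numbers from the list, detection takes 9 seconds, so a 20-second sleep fits inside the assumed window while a 30-second sleep does not.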

    
    // Written by the shutdown thread and read by request threads, hence volatile.
    private volatile boolean isShuttingDown = false;

    @EventListener
    public void onShutdown(ContextClosedEvent event) {
        log.warn("shutdown event received {}", event.getSource().toString());
        log.warn("/ping will respond 404, Main thread will sleep for 20 seconds to allow pending tasks to complete");

        // From here on the health endpoint fails, so the load balancer drains this instance.
        isShuttingDown = true;
        try {
            Thread.sleep(SLEEP_BEFORE_SHUTDOWN_MILLIS);
        } catch (InterruptedException e) {
            log.error("sleep before shutdown interrupted", e);
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }

        log.warn("Shutting down now, daemon threads will continue work");
        releaseResources();

        log.info("{} {} on {} stopped.", NAME, VERSION, HOSTNAME);
    }

    // Health endpoint polled by the load balancer
    @RequestMapping(value = "ping", produces = MediaType.TEXT_PLAIN_VALUE)
    public ResponseEntity<String> ping() {
        if (isShuttingDown) {
            log.warn("health failed - shutting down soon");
            return new ResponseEntity<>(HttpStatus.NOT_FOUND);
        }
        return ResponseEntity.ok("pong");
    }
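
The fixed sleep in the listener above always waits the full duration, even when nothing is left to process. A possible variation, which is not part of the solution described here, is to count in-flight requests and stop waiting once the count reaches zero. The sketch below assumes a hypothetical `InFlightTracker` that your request handling would call at the start and end of each request; it still waits a minimum time first so the load balancer has a chance to notice the failing health check.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-flight request tracker; a variation on the fixed sleep. */
final class InFlightTracker {
    private final AtomicInteger inFlight = new AtomicInteger();

    /** Call when a request starts, e.g. from a servlet filter. */
    void begin() { inFlight.incrementAndGet(); }

    /** Call in a finally block when the request ends. */
    void end() { inFlight.decrementAndGet(); }

    int current() { return inFlight.get(); }

    /**
     * Waits at least minMillis (so the load balancer can notice the failing
     * health check), then returns as soon as nothing is in flight or maxMillis
     * has elapsed. Returns true if the tracker drained to zero.
     */
    boolean awaitDrained(long minMillis, long maxMillis) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(minMillis);
        while (inFlight.get() > 0) {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= maxMillis) {
                return false; // gave up: requests are still running
            }
            Thread.sleep(50); // poll interval, chosen arbitrarily
        }
        return true;
    }
}
```

In the shutdown listener, a call such as `tracker.awaitDrained(10_000, 25_000)` would take the place of `Thread.sleep(SLEEP_BEFORE_SHUTDOWN_MILLIS)`; the two values here mirror the 10-second detection time and the upper end of the 15-25 second sleep mentioned in the steps.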