1
votes

I'm running a large cluster process that requires hundreds of VMs. The process is fault-tolerant, so I can use preemptible VMs. However, each preemption costs a few minutes while the process restarts, so I'd like to choose the Google Cloud region and zone that is least busy and therefore least likely to preempt my VMs.

Is there a way to tell which of Google's regions or zones is least busy or least used?

References:

I read this entire guide and it was unhelpful. I'm less concerned about latency than about the time lost each time my process has to restart: https://cloud.google.com/solutions/best-practices-compute-engine-region-selection

Google's preemptible VM documentation: https://cloud.google.com/compute/docs/instances/preemptible

Google Zones: https://cloud.google.com/compute/docs/regions-zones/


1 Answer

2
votes

There is currently no Google Cloud activity map showing how busy each zone is, but that would be a good feature to request [1].

As you probably know, and as mentioned in the preemptible VM documentation [2], these instances run for at most 24 hours. Since you use a large number of preemptible VMs and Compute Engine may terminate them at any time if it needs those resources back, I can understand the impact.

Have you tried spreading your cluster over all the zones of a single region? For example, us-central1 has 4 different zones, so you could distribute your instances across those 4 zones to lower the impact of preemptions in any one zone. (Just an idea.)
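To make the idea concrete, here is a minimal sketch of a round-robin placement plan. It only computes which zone each VM would go to; it does not call any Google Cloud API. The function name `spread_across_zones`, the `worker-NNNN` naming scheme, and the hard-coded zone list are assumptions for illustration, so check the Regions and Zones page for the zones actually available to your project.

```python
from itertools import cycle


def spread_across_zones(instance_count, zones):
    """Assign instance_count VMs to zones round-robin.

    Returns a dict mapping each zone to the list of (hypothetical)
    instance names planned for it.
    """
    if not zones:
        raise ValueError("at least one zone is required")
    plan = {zone: [] for zone in zones}
    zone_cycle = cycle(zones)
    for index in range(instance_count):
        zone = next(zone_cycle)
        # The "worker-NNNN" naming scheme is an assumption, not a GCP convention.
        plan[zone].append(f"worker-{index:04d}")
    return plan


# Assumed zone list for us-central1; verify it against the current docs.
US_CENTRAL1_ZONES = [
    "us-central1-a",
    "us-central1-b",
    "us-central1-c",
    "us-central1-f",
]

if __name__ == "__main__":
    plan = spread_across_zones(300, US_CENTRAL1_ZONES)
    for zone, names in plan.items():
        print(zone, len(names))
```

Each zone's list can then be handed to whatever tooling you already use to create the instances. With an even split, a wave of preemptions in one zone would affect roughly a quarter of the cluster rather than all of it, although this does not reduce how often any single VM is preempted.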