2
votes

I'm relatively new to using AWS Batch, and have been noticing it takes a LONG time to spin up EC2 instances in a managed compute environment.

My jobs will go from Submitted > Pending > Runnable within 1 minute.

But sometimes they will sit in Runnable anywhere from 15 minutes to 1 hour before an EC2 instance finally gets around to spinning up.

Any tips and tricks on getting AWS Batch to spin up instances more quickly?

Ideally I'd like an instance the moment something is in the Runnable state.


For some more context, I am using AWS Batch essentially like Lambda, but with my own choice of instance type and disk. I can't use Lambda because the jobs need a lot more resources (GPUs) and time to process.


2 Answers

0
votes

It would appear the scheduler takes its time based on non-transparent load at the data center.

I would love it if submitting a Batch job returned an estimated time to launch.

But anyways, sometimes I get machines instantly, sometimes it takes up to 15 minutes, and sometimes it will take an hour or more for newer GPU instance types, because there are not any available.
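Since Batch gives no estimate up front, the wait can at least be measured after the fact. Below is a minimal sketch, assuming boto3 is installed and credentials are configured. The helper names `queue_wait` and `print_waits` are my own, not part of any AWS API. It relies on the `createdAt` and `startedAt` fields (epoch milliseconds) that `describe_jobs` returns, so the figure covers the whole time from submission to the job starting, not just the Runnable phase.

```python
from datetime import timedelta
from typing import Optional


def queue_wait(job: dict) -> Optional[timedelta]:
    """Time between job creation and the job starting to run.

    `job` is one entry from the "jobs" list returned by describe_jobs.
    Returns None if the job has not started yet.
    """
    created = job.get("createdAt")
    started = job.get("startedAt")
    if created is None or started is None:
        return None
    # Both fields are Unix timestamps in milliseconds.
    return timedelta(milliseconds=started - created)


def print_waits(job_ids: list) -> None:
    """Hypothetical helper: look up jobs by ID and print how long each waited."""
    import boto3  # imported here so queue_wait stays usable without boto3

    batch = boto3.client("batch")
    response = batch.describe_jobs(jobs=job_ids)
    for job in response["jobs"]:
        print(job["jobName"], job["status"], queue_wait(job))
```

Calling `print_waits(["<your-job-id>"])` over a few days of jobs should show whether the long waits line up with particular instance types.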

There doesn't appear to be any way to control the scheduling. Oh well.

0
votes

Note: The settings below might help reduce provisioning time, but will incur additional costs.

Compute environments -> Compute resources -> Minimum vCPUs

Setting this to 1 (or more) keeps at least one instance running all the time.
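The same change can be made outside the console. This is a rough sketch using boto3's `update_compute_environment`, assuming credentials are configured. The environment name `my-gpu-env` and the helper names are placeholders I made up for illustration.

```python
def min_vcpus_update(compute_environment: str, min_vcpus: int = 1) -> dict:
    """Build the arguments for update_compute_environment that raise minvCpus."""
    if min_vcpus < 0:
        raise ValueError("min_vcpus must be zero or greater")
    return {
        "computeEnvironment": compute_environment,  # name or ARN
        "computeResources": {"minvCpus": min_vcpus},
    }


def apply_update(kwargs: dict) -> dict:
    """Send the update to AWS Batch (needs boto3 and valid credentials)."""
    import boto3

    return boto3.client("batch").update_compute_environment(**kwargs)
```

For example, `apply_update(min_vcpus_update("my-gpu-env", 1))` would keep capacity warm in that environment. Setting the value back to 0 lets it scale down again when the queue is empty.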

Compute environments -> Compute resources -> Allocation strategy

Changing this from "BEST_FIT" to "BEST_FIT_PROGRESSIVE" will also help.
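If you end up creating a new compute environment to get this strategy, the `computeResources` block passed to boto3's `create_compute_environment` might look roughly like the sketch below. Every concrete value here (instance types, subnet, security group, instance role, vCPU limits) is a placeholder I chose for illustration, not something from your setup.

```python
def progressive_compute_resources(
    subnets: list,
    security_group_ids: list,
    instance_role: str,
    instance_types: list,
    max_vcpus: int = 64,
    min_vcpus: int = 0,
) -> dict:
    """Build a computeResources dict for a managed EC2 compute environment."""
    return {
        "type": "EC2",
        # Lets Batch pick additional instance types when the best fit is unavailable.
        "allocationStrategy": "BEST_FIT_PROGRESSIVE",
        "minvCpus": min_vcpus,
        "maxvCpus": max_vcpus,
        "instanceTypes": instance_types,
        "subnets": subnets,
        "securityGroupIds": security_group_ids,
        "instanceRole": instance_role,
    }


# Placeholder values only; substitute your own resources.
resources = progressive_compute_resources(
    subnets=["subnet-PLACEHOLDER"],
    security_group_ids=["sg-PLACEHOLDER"],
    instance_role="ecsInstanceRole",
    instance_types=["g4dn.xlarge", "g4dn.2xlarge"],
)
```

The dict would then be passed as `computeResources=resources` to `create_compute_environment`, along with `computeEnvironmentName` and `type="MANAGED"`. Listing more than one instance type gives the progressive strategy something to fall back on when the first choice has no capacity.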