0
votes

Whenever I try to submit a training job to gcloud using the command

gcloud ml-engine jobs submit training

it fails with this quota error:

The requested 60.0 CPUs exceeds the allowed maximum of 20.0.

I never specified 60.0 CPUs in the command. According to the Google docs, I need to increase the quota to make this work. Is there any way to stay within the 20.0 CPU quota and still train the model on GCP?

2
What's your region? What are all the params of your submission? (guillaume blaquiere)

2 Answers

1
votes

I'm not sure whether this is a solution for your issue, but here is what I did when I got:

The requested N CPUs exceeds the allowed maximum of 20.0.

from gcloud ai-platform jobs submit training. According to this and this links, you can pass the --scale-tier argument to the submit training command. It controls some specs of your job, including the number of workers. If you set --scale-tier to STANDARD_1, PREMIUM_1 or CUSTOM, the number of CPU workers scales up accordingly (e.g. to 60.0 CPUs in your case).

Since the BASIC tier is a "single worker instance", simply switching to

gcloud ai-platform jobs submit training --scale-tier BASIC-[GPU|TPU]

should solve this quota issue. The point about increasing your quota is valid, but as far as I understand, a larger number of workers is not what you want in your case.

Otherwise, if you want to speed up your training, look at the CUSTOM tier and its workerCount argument, which specifies the number of workers to use (more information on that is here).
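As a rough sketch of the CUSTOM tier route, something like the following might keep a multi-worker job under the 20 CPU limit. The job name, bucket, trainer package and machine types are placeholders I made up for illustration, and I'm assuming the quota counts the vCPUs of every machine in the job (1 master + 2 workers on n1-standard-4 would be 12 vCPUs). The script only prints the gcloud command instead of running it.

```shell
#!/bin/sh
# Placeholder values (assumptions, not taken from the question).
JOB_NAME="my_training_job"
REGION="us-central1"

# CUSTOM tier: 1 master + 2 workers, each n1-standard-4 (4 vCPUs),
# i.e. 12 vCPUs in total, assuming the quota counts vCPUs per machine.
cat > config.yaml <<'EOF'
trainingInput:
  scaleTier: CUSTOM
  masterType: n1-standard-4
  workerType: n1-standard-4
  workerCount: 2
EOF

# Print the command rather than executing it; drop the "echo" to submit.
echo gcloud ai-platform jobs submit training "$JOB_NAME" \
  --region "$REGION" \
  --config config.yaml \
  --module-name trainer.task \
  --package-path trainer/ \
  --job-dir "gs://my-bucket/$JOB_NAME"
```

Lowering workerCount or picking smaller machine types is the knob to turn if the job still exceeds the quota.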

0
votes

According to the Cloud ML Engine quota documentation, the AI Platform CPU quotas do not count against the Compute Engine CPU quota.

The documentation mentions that certain quotas can be increased from the Console, but the AI Platform CPU quota does not appear to be one of them. Instead, you can request more AI Platform CPUs for your training job (the default is 20) through this form, and the process is explained here.

Last but not least, quota increase requests are not granted for Free Tier accounts, so you would need to upgrade first. I'm not sure whether that applies to you; if you have already upgraded, you can go ahead and request more quota.