I'm not sure whether this solves your issue or not, but here is what I did when I got:
The requested N CPUs exceeds the allowed maximum of 20.0.
from gcloud ai-platform jobs submit training. According to this and this, you can pass the --scale-tier argument to the submit training command, which controls some of the specs of your job, including the number of workers. If you set --scale-tier to STANDARD_1, PREMIUM_1, or CUSTOM, the number of CPU workers scales up accordingly (which is how you end up requesting 60.0 CPUs in your case).
Since the BASIC tier is a "single worker instance", simply switching to
gcloud ai-platform jobs submit training --scale-tier basic-gpu (or basic-tpu)
should resolve the quota issue. The point about increasing your quota is valid, but as far as I can tell a larger number of workers is not what you want in your case.
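For reference, a minimal submit command could look roughly like the following; the job name, region, bucket, package path, and module name are placeholders you would replace with your own values:

gcloud ai-platform jobs submit training my_training_job \
    --region us-central1 \
    --module-name trainer.task \
    --package-path ./trainer \
    --job-dir gs://my-bucket/training-output \
    --scale-tier basic-gpu

With basic-gpu you get a single worker instance with one GPU, so the job should stay well under the 20.0 CPU quota.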
Otherwise, if you want to speed up your training, you should look at the CUSTOM tier and its workerCount field, which specifies the number of workers to use (more information on that is here).
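As a rough sketch of that option (the machine types and counts below are illustrative, not recommendations), a CUSTOM tier job is typically described in a YAML config file and passed to the command with --config:

# config.yaml -- example values only, adjust to your project and quota
trainingInput:
  scaleTier: CUSTOM
  masterType: n1-standard-4
  workerType: n1-standard-4
  workerCount: 3

gcloud ai-platform jobs submit training my_training_job \
    --region us-central1 \
    --module-name trainer.task \
    --package-path ./trainer \
    --job-dir gs://my-bucket/training-output \
    --config config.yaml

Just keep in mind that the total vCPUs across the master and all workers still has to fit within your 20.0 CPU quota (here 4 + 3 x 4 = 16 vCPUs), otherwise you are back to the same error.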