Adding GKE node pool with GPU using terraform

Question

I am trying to create a google_container_node_pool with GPUs. I tried the machine types nvidia-tesla-p4 and a2-highgpu-1g; each returns a different error:

projects/my-project-id/zones/us-central1-a/machineTypes/nvidia-tesla-p4

or

Error: error creating NodePool: googleapi: Error 403: Insufficient regional quota to satisfy request: resource "PREEMPTIBLE_NVIDIA_V100_GPUS": request requires '3.0' and is short '2.0'. project has a quota of '1.0' with '1.0' available. View and manage quotas at https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=my-project-id., forbidden

When I check the quotas page, the relevant quota shows "All 99 quotas are within limit".

According to the requirements I need quota, but they don't specify which one.

Update:

Changing the machine_type to a2-highgpu-1g changed the error message so that it refers to a different quota, A2_CPUS. When I change the value of preemptible to false, I get the same error for NVIDIA_A100_GPUS instead of PREEMPTIBLE_NVIDIA_V100_GPUS or A2_CPUS. The problem with both A2_CPUS and NVIDIA_A100_GPUS is that I can't request more quota: the checkbox in the UI is disabled and the limit is shown as "Unlimited".

It seems there are already 2 answers to this question. If one of them solves your problem, please vote on it or accept it. If you need more clarification, remember that you can also comment on the answers. — Judith Guzman
I gave up after a few attempts, mostly due to other pressing matters. I hope to get back to it in the next version, which may take a few weeks. — Johnathan Kanarek
Please see my updated answer. TL;DR you should request an increase of the REGIONAL quota, as zonal quota is not actionable. — Judith Guzman
Also, make sure you have enough CPU + A2 CPU quota in the region — Judith Guzman

hilsenrat · Accepted Answer · 2021-01-15T14:21:47

You don't see an error on the Quotas page because your quotas were never violated: the request was rejected, so the nodes weren't created.

For example, if you want to create a node pool with 3 nodes that each have 1 V100 GPU, go to the Quotas page and request an increase of PREEMPTIBLE_NVIDIA_V100_GPUS from 1 to 3. Repeat with the relevant numbers for each GPU type and region (the error reports a regional quota).
Please note that you should wait until GCP approves your requests before trying to create the resources again in Terraform.

If you don't wish to increase the quotas and just want to check your TF configuration, reduce the number of GPU nodes to a number that fits within your current quotas.
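
To make the arithmetic concrete, here is a minimal sketch of such a node pool, not a drop-in configuration. It assumes an existing cluster and uses hypothetical names ("gpu_pool", "my-cluster", the gpu_node_count variable) plus an assumed machine type, n1-standard-4. Note that nvidia-tesla-p4 and nvidia-tesla-v100 are accelerator types, so they belong in guest_accelerator rather than in machine_type. The GPU quota consumed is node count times GPUs per node, so the default of 1 should fit the existing quota of 1, while 3 would need the quota raised to 3 first.

```hcl
# Hypothetical names throughout: "gpu_pool", "my-cluster" and the variable
# below are placeholders, not values taken from the question.
variable "gpu_node_count" {
  description = "Number of GPU nodes; GPUs requested = node count * GPUs per node."
  type        = number
  default     = 1 # fits a quota of 1; raise to 3 only after the increase is approved
}

resource "google_container_node_pool" "gpu_pool" {
  name       = "gpu-pool"
  project    = "my-project-id"
  location   = "us-central1-a"
  cluster    = "my-cluster" # assumed name of an existing GKE cluster
  node_count = var.gpu_node_count

  node_config {
    # A regular machine type; the GPU model is set in guest_accelerator below.
    machine_type = "n1-standard-4"

    # true counts against PREEMPTIBLE_NVIDIA_V100_GPUS,
    # false counts against NVIDIA_V100_GPUS.
    preemptible = true

    guest_accelerator {
      type  = "nvidia-tesla-v100"
      count = 1 # GPUs per node
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform",
    ]
  }
}
```

Keeping the node count in a variable lets you validate the configuration with one node now and scale up later with -var="gpu_node_count=3" once the quota request has been approved.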
