Request Rate Limit on Google Compute Engine

Question

I am running a Tornado webserver on Google Compute Engine. The webserver returns a very simple JSON response. When I test the throughput capacity of this server, it seems to be throttled at 20 req/s; I cannot achieve a higher throughput than that.

I know that there is a Google Compute Engine API rate limit of 20 req/s. Is there some sort of network or instance rate limit that prevents my server from fulfilling more than 20 req/s? How do I increase this limit?

Where is your client? On the same VM? 20 req/s sounds ridiculous; as far as I know, there's no rate limit of x req/s. Intra-zone VM-to-VM throughput is up to 14 Gbps and the round-trip latency is as low as 100 µs. You can simply run a Python server with "python -m SimpleHTTPServer 80" and test the rate. — Dagang
My client was installed on a corporate server in California. It turns out that the corporate server was limiting my requests to 20 req/s. When I ran the same client on an Amazon VM, I got to 5000+ req/s with no increase in response time. — Max Ferguson
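
Since the limit here turned out to be on the client side, one way to narrow down such a problem is to run the same small load test from several machines and compare the measured rates. The following is only a minimal sketch using the Python 3 standard library; the helper names `http_get` and `measure_rps`, the default request counts, and the example URL are assumptions for illustration, not part of the original setup.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get(url):
    """Issue one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()  # drain the body so the request fully completes
        return resp.status


def measure_rps(url, total_requests=200, workers=20, do_request=http_get):
    """Send total_requests requests through a thread pool.

    Returns (ok_count, requests_per_second). do_request is injectable so
    the timing logic can be exercised without a network connection.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(do_request, [url] * total_requests))
    elapsed = time.perf_counter() - start
    ok_count = sum(1 for status in statuses if status == 200)
    return ok_count, total_requests / elapsed


# Example call (hypothetical address; replace with the server under test):
# ok, rps = measure_rps("http://203.0.113.10:8888/")
# print(ok, "successful requests at", round(rps, 1), "req/s")
```

If the rate stays near the same low value from one network but is much higher from another, the bottleneck is likely the client or its network rather than the VM.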

Paul R. Nash · Accepted Answer · 2016-11-21T10:03:07

The rate limit of 20 requests per second is not on the server; it is on the GCE API. It applies, for example, when you make calls from gcloud to create instances (gcloud calls the GCE API under the covers).

As documented here, the network bandwidth of a GCE VM is limited mainly by the software you run on it and, to some extent, by the size of the VM (VMs get up to 2 Gbps per core, up to 8 cores, for a maximum rate of 16 Gbps). Nothing in the VM subsystem knows anything about requests or responses; it's all just IP traffic to us.
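
As a rough illustration of the per-core figure above, the cap can be written as a one-line calculation. This is only a sketch based on the numbers quoted in this answer (2 Gbps per core, 16 Gbps maximum); the function name `estimated_bandwidth_cap_gbps` is made up for the example, and current GCE limits may differ.

```python
def estimated_bandwidth_cap_gbps(cores, per_core_gbps=2, max_gbps=16):
    """Estimate the VM network cap from the figures quoted in the answer.

    Assumes 2 Gbps per core up to a 16 Gbps ceiling (reached at 8 cores).
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return min(cores * per_core_gbps, max_gbps)


for cores in (1, 4, 8, 16):
    print(cores, "cores:", estimated_bandwidth_cap_gbps(cores), "Gbps")
```

Even the 1-core figure is far more than a small JSON response at 20 req/s would need, which is consistent with the limit not coming from the VM.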
