I’m learning to work with Cloud Functions (and Cloud Run) and would like to know its behaviour when an HTTP-triggered function is already running at max-instances capacity and more HTTP requests come in.
1) Here’s my basic function code (a simple function with approx. 10 s execution time per invocation, because of the time.sleep(10) call):
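import time
import flask

# Module-level counter; it persists across invocations on the same instance.
ctr = 0

def hello_world(request):
    global ctr
    print("hello_world(): " + str(ctr))
    ctr = ctr + 1
    # Simulate slow work so that concurrent requests overlap.
    time.sleep(10)
    response = flask.Response("success::", 200)
    return response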
2) I deployed this function with the flag --max-instances=1 (just to ensure no new VM instances come up to handle concurrent requests).
3) I then sent 5 concurrent requests.
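For reference, the concurrent requests in step 3 can be reproduced with something like the sketch below. It is only a minimal illustration, not necessarily how the original test was run: the helper names `fetch_status` and `send_concurrent` are made up for this example, and the function URL has to be supplied by the caller.

```python
import concurrent.futures
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=120):
    """Return the HTTP status code of a GET request, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses such as 429 or 500; keep the code.
        return err.code


def send_concurrent(url, n=5, fetch=fetch_status):
    """Fire n requests at once and count the status codes that come back.

    `fetch` is injectable (a hypothetical convenience) so the helper can be
    exercised without a deployed function.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        statuses = list(pool.map(fetch, [url] * n))
    return Counter(statuses)
```

Calling `send_concurrent(url, n=5)` with the deployed function's trigger URL returns a tally of status codes, which makes it easy to see how many requests got 200 versus 429 or 500.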
From what I observe, only one of the requests gets processed. The other 4 requests are simply dropped (the client receives HTTP status code 500, and there is no trace of these dropped requests in Stackdriver Logging either).
In the link here https://cloud.google.com/functions/docs/max-instances it says:
In that case, incoming requests queue for up to 60 seconds. During this 60 second window, if an instance finishes processing a request, it becomes available to process queued requests. If no instances become available during the 60 second window, the request fails.
Based on this, I expected that while one request is being handled, the others would be queued for up to 60 seconds. Therefore, all 5 requests should have been processed (or at least more than one, if not all 5!). However, the actual behaviour I'm seeing is different.
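To make that expectation concrete, here is a rough sketch of the arithmetic. It assumes a deliberately simplified model of the documented behaviour: each instance handles one request at a time, requests are served in arrival order, and a request is served only if an instance frees up within the 60-second window. The function name `expected_outcomes` and its parameters are hypothetical.

```python
def expected_outcomes(n_requests, exec_seconds, max_instances=1, queue_window=60):
    """Return (wait_seconds, served) per request under the simplified model."""
    results = []
    for i in range(n_requests):
        # Requests are handled in batches of `max_instances`; each batch
        # has to wait for all earlier batches to finish.
        wait = (i // max_instances) * exec_seconds
        results.append((wait, wait <= queue_window))
    return results


# 5 requests, 10 s each, one instance: waits of 0, 10, 20, 30 and 40 s,
# all inside the 60 s window, so every request would be expected to succeed.
print(expected_outcomes(5, 10))
```

Under these assumptions none of the 5 requests should fail, which is why the 500 responses are surprising.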
Can someone please explain?
UPDATE-1: It seems the fix has been released and the documentation updated. Observations after the fix:
1) It still returned status code 500 for some of the concurrent requests during the initial cold start (when no instances were running). This is expected, I suppose.
2) It also temporarily exceeded max-instances=1 during the very first burst of 10 requests, launching up to 4 instances and successfully serving all 4 requests.
3) Thereafter, the number of instances did come down to respect the --max-instances=1 setting, and all but one request returned status code 429.