How to resolve Google App Engine latency?

Question

Our project runs on the Google App Engine standard environment with automatic scaling configured as shown below. Warmup requests are enabled in the app, and we use the Google Endpoints service. However, I am facing latency issues in several scenarios.

Environment: Java 8
Instance type: F4_1G

Autoscaling configuration:

min-instances: 2
max-concurrent-requests: 80
min-pending-latency: 6s
max-pending-latency: 10s

I tested with JMeter, sending 85 asynchronous requests with a ramp-up period of 10 seconds. From the application logs I can see that App Engine takes a long time to serve the requests. Below are my questions.

1. Most of the requests fail because they exceed the time limit. In image 1, the request takes 88.2 seconds. I know that App Engine automatic scaling has a 60-second request timeout. But we configured autoscaling with a minimum of 2 instances and no max-instances limit. Either the existing instances should handle the request, or App Engine should scale up to handle it. Why is that not happening? Image_1

2. While scaling up, the request takes 43.6 seconds. In image 2, the request arrived at 20:27:01:663 IST and the first line of API execution started at 20:27:40:407 IST. What is happening in between? Can I get logs for this period? Image_2

3. After the scale-up, subsequent requests also take a very long time to serve. An API request usually completes within 2 seconds. In image 3, the API takes 42.4 s even though no loading request is involved: the request arrived at 20:27:01:728 IST and the first line of API execution started at 20:27:40:708 IST. What is happening in between? Image_3
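
For reference, the gap between request arrival and the first line of API execution can be computed from the timestamps quoted for images 2 and 3. This is only a minimal sketch: the helper name `gap_seconds` is hypothetical, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime

# The log timestamps use ':' before the milliseconds, e.g. 20:27:01:663
FMT = "%H:%M:%S:%f"


def gap_seconds(arrived: str, started: str) -> float:
    """Seconds between request arrival and the first line of API execution.

    Hypothetical helper; assumes both timestamps are on the same day.
    """
    delta = datetime.strptime(started, FMT) - datetime.strptime(arrived, FMT)
    return delta.total_seconds()


print(gap_seconds("20:27:01:663", "20:27:40:407"))  # image 2
print(gap_seconds("20:27:01:728", "20:27:40:708"))  # image 3
```

The two gaps come out to about 38.7 s and 39.0 s, so most of the 43.6 s and 42.4 s totals is spent before the API code starts running.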

Jan Hernandez · Accepted Answer · 2020-04-06T17:04:30

I think this is related to the Java 8 runtime, which takes a long time to start a new instance because Java is a heavy runtime.

If the startup time exceeds 60 s, the requests that are still waiting to be served end with a timeout.

I think you can improve your scaling strategy. For example, start your service with more instances and add the "target-throughput-utilization" option (target_throughput_utilization in app.yaml) so that a new instance starts before you hit your 80 concurrent requests.

The documentation states: 'When the number of concurrent requests reaches a value equal to max-concurrent-requests times target-throughput-utilization, the scheduler starts a new instance.'

min-instances: 4
max-concurrent-requests: 80
target-throughput-utilization: 0.75
min-pending-latency: 6s
max-pending-latency: 10s

In my example, a new instance starts when a running instance reaches 80 × 0.75 = 60 concurrent requests.
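
The arithmetic from the documentation quote can be sketched as a small calculation. This is only an illustration of the formula, not App Engine code: the function name `scale_out_threshold` is hypothetical, and the actual scheduler may take other signals into account.

```python
def scale_out_threshold(max_concurrent_requests: int,
                        target_throughput_utilization: float) -> float:
    """Concurrent requests per instance at which the scheduler starts a new instance.

    Hypothetical helper that applies the formula quoted from the documentation:
    max-concurrent-requests times target-throughput-utilization.
    """
    return max_concurrent_requests * target_throughput_utilization


# Values from the configuration in this answer
print(scale_out_threshold(80, 0.75))

# A lower utilization target makes scaling start earlier
print(scale_out_threshold(80, 0.5))
```

With the values above, the threshold is 60 concurrent requests; lowering the target to 0.5 moves it to 40, which leaves more headroom while a new Java instance is still starting.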
