We have a gRPC application deployed in a cluster (v1.17.6) with Istio (v1.6.2) set up. The cluster uses istio-ingressgateway as the edge LB, with SSL termination. The istio-ingressgateway is fronted by an AWS ELB (Classic Load Balancer) in passthrough mode. This setup is fully functional and traffic generally flows as intended. The setup looks like:
ELB => istio-ingressgateway => virtual service => app service => [(envoy)pods]
We are running load tests on this setup using GHZ (ghz.sh), which runs outside the application cluster. In the tests we've run, each app container seems to get about 300 RPS routed to it, no matter how the GHZ test is configured. For reference, we have tried various combinations of the --concurrency and --connections settings. This ~300 RPS is lower than what we expect from the app, so we need many more pods to provide the required throughput.
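To make the pod-count impact concrete, here is a minimal sketch of the arithmetic. The 300 RPS figure is what we observe; the target throughput (6000 RPS) and the expected per-pod rate (1000 RPS) are hypothetical numbers chosen only for illustration, and `pods_needed` is a helper name we made up.

```python
import math


def pods_needed(target_rps: int, rps_per_pod: int) -> int:
    """Return the minimum number of pods needed to serve target_rps."""
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    return math.ceil(target_rps / rps_per_pod)


if __name__ == "__main__":
    target = 6000  # hypothetical required throughput, not a measured value
    print(pods_needed(target, 300))   # observed ~300 RPS per pod -> 20
    print(pods_needed(target, 1000))  # hypothetical expected rate -> 6
```

With these illustrative numbers, the observed per-pod rate would require more than three times as many pods as the rate we were hoping for.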
We are really interested in understanding the details of the physical connection (gRPC/HTTP2) setup in this case, all the way from the ELB to the app/envoy, and the details of the load balancing being done. Of particular interest is the case where the same client (e.g., GHZ) opens multiple connections (specified via the --connections option). We have looked at Kiali and it doesn't give us the visibility we need.
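To show what we are trying to distinguish, here is a toy sketch of our current mental model, which may well be wrong. It assumes that a TCP-level hop pins each client connection to one backend, while an HTTP/2-aware proxy picks a backend for every request, and it uses plain round-robin for both. The pod names and function names are made up, and this is not a model of what the ELB or Envoy actually do.

```python
from collections import Counter
from itertools import cycle


def per_connection(num_connections, requests_per_connection, backends):
    """Pin each connection to one backend; all its requests go there."""
    picker = cycle(backends)
    counts = Counter({b: 0 for b in backends})
    for _ in range(num_connections):
        counts[next(picker)] += requests_per_connection
    return dict(counts)


def per_request(num_connections, requests_per_connection, backends):
    """Pick a backend for every request, regardless of its connection."""
    picker = cycle(backends)
    counts = Counter({b: 0 for b in backends})
    for _ in range(num_connections * requests_per_connection):
        counts[next(picker)] += 1
    return dict(counts)


if __name__ == "__main__":
    pods = ["pod-a", "pod-b", "pod-c", "pod-d"]  # hypothetical pod names
    # 2 client connections, 1000 requests each
    print(per_connection(2, 1000, pods))  # only two pods receive traffic
    print(per_request(2, 1000, pods))     # 500 requests on every pod
```

With few long-lived connections, the per-connection model leaves some pods idle, whereas the per-request model spreads the load evenly. We would like to know which of these (if either) applies at each hop in our setup.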
Questions:
- How can we get visibility into the physical connections being set up from the ingress gateway to the pod/proxy?
- How is the “per request gRPC” load balancing happening?
- What options might exist to optimize the various components involved in this setup?
Thanks.