What is the best practice for opening channels in gRPC?

Question

Several services call service A (10 replicas) over gRPC (100+ req/sec) using Java generated stubs. We don't have load balancers, but I am curious what the best practice is in both cases.

Should clients build the channel on each call to service A, or should I create the ManagedChannel once and keep it until the app shuts down?

If I create one for each request, calls are distributed across the 10 replicas, but if I create it only once when the application starts, all calls go to the same service A replica.
On the other hand, if I create one on each call, wouldn't there be thousands of connections open until they go idle (which takes 30 minutes by default)?

ManagedChannel managedChannel = ManagedChannelBuilder
        .forAddress(host, port)
        .usePlaintext()
        .build();
ServiceA.newBlockingStub(managedChannel).fooBar(...);

If I understood your question correctly, I would use a streaming client and send messages to the server asynchronously over a single ManagedChannel. Right now you are using a blocking client, which makes things slow.... — Felipe
The problem is that our infra doesn't support long-lived connections, which is why streaming is not an option. — hevi

Eric Anderson · Accepted Answer · 2021-01-19T19:59:36

ManagedChannels should be created seldom and reused heavily. When a ManagedChannel will no longer be used, shutting it down is essential. Otherwise it will leak.
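
As a minimal sketch of that lifecycle, assuming grpc-java is on the classpath: the holder below builds the channel once at startup and shuts it down when the application stops. The class name `ServiceAChannelHolder` and the 5-second timeout are hypothetical choices for illustration, not part of gRPC.

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: owns one channel for the lifetime of the application. */
public final class ServiceAChannelHolder implements AutoCloseable {
    private final ManagedChannel channel;

    public ServiceAChannelHolder(String host, int port) {
        // Built once at startup; every stub created from it shares its connections.
        this.channel = ManagedChannelBuilder.forAddress(host, port)
                .usePlaintext()
                .build();
    }

    /** Pass this to newBlockingStub(...) wherever a stub is needed. */
    public ManagedChannel channel() {
        return channel;
    }

    @Override
    public void close() throws InterruptedException {
        // Orderly shutdown: stop accepting new RPCs, let in-flight ones finish.
        channel.shutdown();
        // The 5-second wait is an assumed value; force-close if it is exceeded.
        if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
            channel.shutdownNow();
        }
    }
}
```

Stubs are cheap, so creating a new stub per call from `holder.channel()` is fine; it is the channel that should be long-lived.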

This is a load balancing question and the answer depends on your load balancing architecture. Based on your description, you are likely using one of two structures:

1. All backends are exposed in DNS. The client connects directly to the backend. I'll call this "exposed".
2. There is a TCP load balancer that the client creates a connection to, and the balancer extends that connection to a backend. I'll call this "hidden".

For both approaches, having backends set nettyServerBuilder.maxConnectionAge(...) is generally necessary for clients to start using new backends.
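
A sketch of what that server-side setting could look like, assuming the grpc-netty transport. The factory class name and the 5-minute and 30-second values are assumptions for illustration; suitable values depend on how quickly clients need to pick up new backends.

```java
import io.grpc.BindableService;
import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;
import java.util.concurrent.TimeUnit;

/** Hypothetical factory showing where maxConnectionAge is configured. */
final class ServiceAServerFactory {
    static Server build(int port, BindableService service) {
        return NettyServerBuilder.forPort(port)
                // Ask clients to reconnect once a connection is this old (assumed value).
                .maxConnectionAge(5, TimeUnit.MINUTES)
                // Time allowed for in-flight RPCs to finish before the connection is closed (assumed value).
                .maxConnectionAgeGrace(30, TimeUnit.SECONDS)
                .addService(service)
                .build();
    }
}
```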

In the exposed architecture, you simply need to configure load balancing in the ManagedChannel. This is probably as simple as using managedChannelBuilder.defaultLoadBalancingPolicy("round_robin"). The round_robin policy will make a connection to each IP address returned by DNS and distribute RPCs across the addresses. When a backend disconnects due to maxConnectionAge the client will re-resolve DNS and make a new connection.
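
For the exposed case, the client-side configuration might look like the sketch below. The `dns:///` target and the host name `service-a.example.internal` are placeholders I am assuming; the only part taken from the answer is `defaultLoadBalancingPolicy("round_robin")`.

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

/** Hypothetical factory for a channel that spreads RPCs over all resolved addresses. */
final class RoundRobinChannelFactory {
    static ManagedChannel create() {
        return ManagedChannelBuilder
                // Placeholder target: a DNS name that resolves to every service A replica.
                .forTarget("dns:///service-a.example.internal:8080")
                // One connection per resolved address, with RPCs rotated across them.
                .defaultLoadBalancingPolicy("round_robin")
                .usePlaintext()
                .build();
    }
}
```

This is still a single ManagedChannel created once; the balancing happens inside it, per RPC.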

In the hidden architecture, if you have many clients where each client is "small" compared to each backend, then maxConnectionAge is sufficient. When a backend disconnects due to maxConnectionAge the client will make a new connection to the load balancer which can choose a new backend.

In the hidden architecture, if you have "big" clients that generate more load than a single backend can handle, then things are harder; the client has no visibility into the number of backends and their state, yet the backends can't manage the load themselves. The easiest thing to do here is to create multiple channels and round-robin over them. In Java you can implement this as a Channel so it is hidden from the majority of your code. When a backend disconnects due to maxConnectionAge that one channel will make a new connection to the load balancer which can choose a new backend. The difficulty with this approach is knowing how many channels to make. The hidden architecture with larger clients benefits a lot from using HTTP load balancing instead of TCP load balancing. Even with HTTP load balancing it may be necessary to use multiple managed channels, in order to balance the load balancer's load.
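
The rotation over multiple channels can be sketched independently of gRPC. The class below is a hypothetical, dependency-free illustration of only the selection logic; `RoundRobinPool` is a name I made up, not a gRPC type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hands out members of a fixed pool in rotation; safe to call from many threads. */
final class RoundRobinPool<T> {
    private final List<T> members;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinPool(List<T> members) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("pool must not be empty");
        }
        // Defensive copy so later changes to the caller's list do not affect the pool.
        this.members = new ArrayList<>(members);
    }

    /** Returns the next member in rotation. */
    T next() {
        // floorMod keeps the index non-negative after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), members.size());
        return members.get(index);
    }

    int size() {
        return members.size();
    }
}
```

With grpc-java, `T` would be `ManagedChannel`, and a small `io.grpc.Channel` subclass could delegate `newCall(method, callOptions)` to `pool.next().newCall(method, callOptions)` and `authority()` to any member, so stubs see one ordinary Channel. Each pooled channel still needs to be shut down when the application stops.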
