I create a deployment, which results in 4 pods running across 2 nodes.
I then expose these pods via a service, which produces the following cluster IP and pod endpoints:
    Name:       s-flask
    ......
    IP:         10.110.201.8
    Port:       <unset>  9080/TCP
    TargetPort: 5000/TCP
    NodePort:   <unset>  30817/TCP
    Endpoints:  192.168.251.131:5000,192.168.251.132:5000,192.168.251.134:5000 + 1 more...
When the service is accessed internally via the cluster IP, requests are balanced across both nodes and all pods, not just the pods on a single node (e.g. as with access via a NodePort).
I know Kubernetes uses iptables to balance requests across the pods on a single node, but I can't find any documentation explaining how Kubernetes balances internal service requests across multiple nodes (we don't use load balancers or an ingress for internal service load balancing).
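For reference, the single-node iptables mechanism I mean looks roughly like this in the nat table on a node (an illustrative sketch based on my reading of kube-proxy's iptables mode; the chain name suffixes are invented, the cluster IP, port, and first endpoint are from the service above):

```
# Cluster IP traffic is matched in KUBE-SERVICES and handed to a per-service chain
-A KUBE-SERVICES -d 10.110.201.8/32 -p tcp --dport 9080 -j KUBE-SVC-SFLASK

# One statistic-mode random rule per endpoint spreads traffic evenly:
# 1/4, then 1/3 of the remainder, then 1/2, then the last rule always matches
-A KUBE-SVC-SFLASK -m statistic --mode random --probability 0.25 -j KUBE-SEP-1
-A KUBE-SVC-SFLASK -m statistic --mode random --probability 0.3333 -j KUBE-SEP-2
-A KUBE-SVC-SFLASK -m statistic --mode random --probability 0.5 -j KUBE-SEP-3
-A KUBE-SVC-SFLASK -j KUBE-SEP-4

# Each endpoint chain DNATs to one pod IP:port
-A KUBE-SEP-1 -p tcp -j DNAT --to-destination 192.168.251.131:5000
```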
The cluster IP itself is virtual. The only way I can see this working is if the cluster IP is round-robin mapped to a service endpoint IP address, where the client would have to look up the cluster IP / service and then select an endpoint IP?
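To make my understanding of the per-node selection concrete: as I understand it, kube-proxy's iptables mode picks an endpoint with one statistic-mode random rule per endpoint, where rule i matches with probability 1/(n-i). A minimal simulation of that selection logic (my own sketch, not Kubernetes code; it uses the three endpoint IPs listed above):

```python
import random
from collections import Counter

# Endpoints listed for the s-flask service above (omitting the "+ 1 more")
ENDPOINTS = [
    "192.168.251.131:5000",
    "192.168.251.132:5000",
    "192.168.251.134:5000",
]

def pick_endpoint(endpoints):
    """Mimic kube-proxy's per-endpoint statistic rules: rule i matches
    with probability 1/(n-i), which makes the overall choice uniform."""
    n = len(endpoints)
    for i, ep in enumerate(endpoints[:-1]):
        if random.random() < 1.0 / (n - i):
            return ep
    return endpoints[-1]  # final rule matches unconditionally

counts = Counter(pick_endpoint(ENDPOINTS) for _ in range(30000))
```

Over 30000 trials each endpoint should be chosen roughly 10000 times, which matches the even spread I observe when hitting the cluster IP.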