I have a few basic questions about Kubernetes.
Consider the deployment below: a Layer 7 load balancer routes requests to the NGINX servers through a Kubernetes Service, and NGINX in turn routes to the Tomcat servers through another Kubernetes Service.
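For concreteness, the two Services in the diagram would look roughly like the following (a sketch only; the names, labels, and ports are placeholders, not taken from my actual manifests):

```yaml
# Hypothetical Service fronting the NGINX tier
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  selector:
    app: nginx        # assumed pod label
  ports:
    - port: 80
      targetPort: 80
---
# Hypothetical Service fronting the Tomcat tier
apiVersion: v1
kind: Service
metadata:
  name: tomcat-svc
spec:
  selector:
    app: tomcat       # assumed pod label
  ports:
    - port: 8080
      targetPort: 8080
```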
My questions:
Is a Kubernetes Service a single point of failure? Or, since it is backed by multiple pods and is only a virtual layer implemented by kube-proxy on each node, should it not be considered a single point of failure?
The diagram above shows a single Kubernetes cluster. Is that cluster itself a single point of failure, or should I plan for multiple clusters for a system that must support HA with zero downtime?
The diagram above relies on Kubernetes Services, which by default provide only L4 load balancing (round robin). So if one Tomcat server is heavily loaded, round robin will not distribute load evenly based on actual usage. How can I achieve load distribution based on system resource consumption, usage, or the number of open connections in this topology?
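For reference, one option I have come across is running kube-proxy in IPVS mode, which supports a least-connection scheduler instead of round robin. A sketch of the kube-proxy configuration (assuming the nodes have the required IPVS kernel modules loaded) would be:

```yaml
# Hypothetical KubeProxyConfiguration enabling IPVS with
# the "lc" (least connection) scheduler instead of round robin.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "lc"
```

Would this be the right approach for the topology above, or is something like an Ingress controller with connection- or load-aware balancing the more common solution?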
Note: the number of rectangular boxes in the diagram is representative only; I will be deploying 10 to 20 pods per tier to meet my workload.