On k8s, when I send requests to pods and all pods are busy (not ready), the requests time out immediately. I want the request to be held until a pod becomes ready and then sent to that pod.
Is there some kind of timeout duration setting for load balancing that controls this? I couldn't find any relevant documentation on this matter. Am I fundamentally misunderstanding something?
PS: I use a readiness probe. The case I mean is when the readiness probes of all pods return false, so every pod is busy and marked not ready.