I am new to Kubernetes and I am having trouble tracking down an exponential-backoff pattern I am observing in the response times of my JMeter load tests. I have a Kubernetes service running between 4 and 32 pods under horizontal pod autoscaling. Each pod runs a Gunicorn WSGI server serving a Django backend. All of the k8s Services sit behind an nginx reverse proxy, which forwards incoming traffic directly to each Service's VIP. nginx in turn sits behind an Amazon ELB, which is exposed to end-user web traffic and ultimately times a request out after 60 seconds.
Each Gunicorn server runs one worker with 3 greenlets and has a backlog limit of 1, so it can hold at most 4 requests at any given time (3 in flight plus 1 queued) and immediately refuses any extra connections nginx tries to send its way. I suspect these failed requests are then being caught and retried with exponential backoff, but I can't pin down where this is happening.
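For concreteness, the per-pod Gunicorn setup is essentially the following `gunicorn.conf.py` (a minimal sketch: I'm assuming the gevent worker class here, and the bind address is a placeholder):

```python
# gunicorn.conf.py -- minimal sketch of the per-pod server settings.
# Assumptions: gevent worker class; bind address is a placeholder.

bind = "0.0.0.0:8000"      # port the pod's container exposes (placeholder)
worker_class = "gevent"    # greenlet-based async worker
workers = 1                # one worker process per pod
worker_connections = 3     # at most 3 greenlets handling requests concurrently
backlog = 1                # at most 1 extra connection queued in the listen socket

# Net effect: 3 in-flight requests + 1 queued = 4 per pod; anything beyond
# that is refused immediately, which nginx sees as an upstream error.
```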
As far as I know, nginx can't be the source of the exponential retries, since each request is proxied to only a single upstream endpoint, so there is no "next upstream" to retry. And I couldn't find anything in the documentation that discusses exponentially timed retries of error responses at any stage of Kubernetes routing. The k8s cluster is running version 1.9.
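To try to isolate which layer is retrying, I've been probing each hop directly with something like the sketch below (the target URL is a placeholder; the idea is to push one pod past its 4-request capacity and see whether the extra requests fail immediately or come back after backoff-shaped delays like ~1s, ~3s, ~7s):

```python
# probe.py -- rough sketch to localize the retry layer.
# Fire more concurrent requests than one pod can hold (4) directly at a
# given hop (pod IP, Service VIP, nginx, or ELB) and record how long each
# takes. Immediate errors mean nothing at or below that hop is retrying;
# latencies clustering near cumulative backoff intervals suggest an
# exponential retry loop somewhere in between.
import time
import concurrent.futures

import requests  # third-party: pip install requests

TARGET = "http://10.0.0.42:8000/health"  # placeholder: pod IP, Service VIP, nginx, or ELB
CONCURRENCY = 12  # comfortably above the 4-request per-pod capacity

def timed_get(i: int) -> str:
    start = time.monotonic()
    try:
        resp = requests.get(TARGET, timeout=65)  # just past the ELB's 60s timeout
        status = str(resp.status_code)
    except requests.RequestException as exc:
        status = type(exc).__name__
    return f"req {i:02d}: {status} after {time.monotonic() - start:6.2f}s"

with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    for line in pool.map(timed_get, range(CONCURRENCY)):
        print(line)
```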