5
votes

Looking to understand AWS Classic Load Balancing here. I have an Elastic Beanstalk web application running behind a Classic ELB, and I am currently the only source of traffic (it is under development). The first request I send always takes exactly 1.26 minutes. If I send another request shortly afterwards, it responds instantly. However, if I wait what seems like a couple of minutes, the response time reverts to 1.26 minutes.

I have tried both enabling and disabling cross-zone load balancing and sticky sessions, with no change to the response times.

I have also tried sending requests from different processes. A successful request from one process does not make the other process respond instantly; each of them has a slow first request.

I would like to understand why this is happening and how to prevent every user's first request from being slow.

I have 2 availability zones with 1 EC2 instance in each running the application. The instances are in private subnets, and the load balancer is in 2 public subnets, one per availability zone.

I am also using nginx with Elastic Beanstalk; here is my config:

upstream api {
    server 127.0.0.1:3000;
}

server {
    listen 8080;
    server_name api.mydomain.com;

    if ($http_x_forwarded_proto = "http") {
        return 301 https://$host$request_uri;
    }

    if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})T(\d{2})") {
        set $year $1;
        set $month $2;
        set $day $3;
        set $hour $4;
    }
    access_log /var/log/nginx/healthd/application.log.$year-$month-$day-$hour healthd;
    access_log /var/log/nginx/access.log main;

    location / {
        proxy_pass http://api;
        proxy_set_header Connection "";
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    gzip on;
    gzip_comp_level 4;
    gzip_types text/html text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
}

Update: I believe we can rule out the following:

  1. DNS: I have verified that the name resolves instantly. Pinging first and then sending the request had no effect, and dig api.mydomain.com responds instantly.
  2. CPU/database latency: the application does not receive the request at all until 1.27 minutes have passed.

The delay seems to occur somewhere between the load balancer receiving the request and the EC2 instance receiving it, but I'm not sure how to debug that. I am seeing a lot of TCP retransmissions in Wireshark before the response returns.
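One way to narrow this down is to connect to each IP address behind the load balancer's DNS name separately, since a load balancer in two subnets normally resolves to more than one address. The following is a minimal sketch, not a definitive diagnostic: the function names `probe` and `probe_all`, the port 443, and the 5-second timeout are assumptions made for illustration. It only times the TCP handshake, so an address that fails or times out while the others connect quickly would point at the network path to that load balancer node rather than at the application.

```python
import socket
import time


def probe(ip, port, timeout=5.0):
    """Attempt one TCP connection to ip:port; return (connected, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((ip, port), timeout=timeout):
            connected = True
    except OSError:
        connected = False
    return connected, time.monotonic() - start


def probe_all(hostname, port=443, timeout=5.0):
    """Resolve hostname and probe every distinct IPv4 address it returns."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry's sockaddr is (ip, port); collect the unique IPs.
    ips = sorted({info[4][0] for info in infos})
    return {ip: probe(ip, port, timeout) for ip in ips}
```

Calling `probe_all("api.mydomain.com")` returns a dict mapping each resolved address to a `(connected, seconds)` pair. Because DNS answers can rotate, it may be worth running it several times and comparing the results per address.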

1
I wonder if it could be caused by DNS resolution? When the ELB is accessed, the DNS name needs to be resolved to an IP address. This is then cached for a period. You could test this by resolving the DNS name first (e.g. ping elb-dns-name) and then sending the request. - John Rotenstein
Successfully pinged, then curled, and same issue. - JBaczuk
Probably the application compiles assets or something. Do you use health checking? - NeverBe
@NeverBe are you saying that AWS will shut down the application after a while? The application startup does take a little while, but not that long. - JBaczuk
@JBaczuk did you manage to debug this? - Shahzaib Sheikh

1 Answer

0
votes

It might be caused by an incorrect VPC configuration!

I recently ran into the same issue, and for me the route table was the culprit. One of the public subnets attached to my ALB was not routing traffic to the internet gateway.

So make sure that the route table associated with each public subnet attached to your load balancer has a route to your internet gateway for the destination range 0.0.0.0/0.

This post made me check my route tables and other VPC settings.
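The route table check described above can be sketched as a small function. This is only an illustration under stated assumptions: the input is assumed to be a list of route table dicts in the shape the EC2 DescribeRouteTables API returns (for example `boto3.client("ec2").describe_route_tables()["RouteTables"]`, filtered to one VPC), and the function names and subnet IDs are hypothetical. A subnet with no explicit association is treated as using the VPC's main route table.

```python
def has_default_igw_route(route_table):
    """True if the table has an active 0.0.0.0/0 route targeting an internet gateway."""
    for route in route_table.get("Routes", []):
        gateway_id = route.get("GatewayId") or ""
        if (route.get("DestinationCidrBlock") == "0.0.0.0/0"
                and gateway_id.startswith("igw-")
                and route.get("State", "active") == "active"):
            return True
    return False


def subnets_missing_igw(route_tables, lb_subnet_ids):
    """Return the load balancer subnet IDs that lack a default route to an IGW.

    Assumes every table in route_tables belongs to the same VPC.
    """
    # The main route table applies to subnets without an explicit association.
    main = next(
        (rt for rt in route_tables
         if any(a.get("Main") for a in rt.get("Associations", []))),
        None,
    )
    # Map each explicitly associated subnet to its route table.
    explicit = {}
    for rt in route_tables:
        for assoc in rt.get("Associations", []):
            if "SubnetId" in assoc:
                explicit[assoc["SubnetId"]] = rt

    missing = []
    for subnet_id in lb_subnet_ids:
        rt = explicit.get(subnet_id, main)
        if rt is None or not has_default_igw_route(rt):
            missing.append(subnet_id)
    return missing
```

Any subnet ID returned by `subnets_missing_igw` is a public subnet whose load balancer node cannot reply to clients over the internet gateway, which would match the symptom of some requests hanging while others succeed.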