1
votes

I have set up an Auto Scaling group with a health check grace period of 300 seconds (5 minutes). A new instance takes at most 2.5 minutes to boot and become ready to handle HTTP requests. However, each time a new instance is added, the ELB starts forwarding traffic to it well before the 5-minute grace period has ended, and as a result I am getting 502 Bad Gateway errors.

Can anyone explain why my Application Load Balancer is behaving like this?

I am using ELB-type health checks, and below are my target group's health check settings:

Protocol : HTTP

Port : 80

Healthy threshold : 2

Unhealthy threshold : 10

Timeout : 10 seconds

Interval : 150 seconds

Success codes : 200


2 Answers

1
votes

This is normal behavior. The grace period does not prevent health checks from happening, and this holds true for both ELB and EC2 health checks. During the grace period you specify, both the ELB and the EC2 service still send health checks to your instance. The difference is that Auto Scaling does not act on the results of those checks, which means it will not schedule the instance for replacement while the grace period is running.
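To make that distinction concrete, here is a minimal sketch of the decision as described above. It is only an illustrative model, not AWS code: the function name `asg_should_replace` and its parameters are hypothetical, and it assumes the grace period is counted from the moment the instance enters service.

```python
def asg_should_replace(seconds_in_service: float,
                       grace_period_s: float,
                       health_check_passed: bool) -> bool:
    """Hypothetical model of the behavior described in this answer.

    Health checks run the whole time, but a failing result is only
    acted upon once the grace period is over.
    """
    if seconds_in_service < grace_period_s:
        # Inside the grace period: results are ignored by Auto Scaling.
        return False
    # After the grace period: an unhealthy instance gets replaced.
    return not health_check_passed


# A failing check at 120 s is ignored with a 300 s grace period.
print(asg_should_replace(120, 300, health_check_passed=False))
# The same failing check after 300 s triggers replacement.
print(asg_should_replace(301, 300, health_check_passed=False))
```

Note that nothing in this model stops the load balancer from routing traffic. That decision depends only on the target passing its health checks, which is why traffic can arrive before the grace period ends.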

Only after the instance is up and running correctly (that is, it has passed the ELB and EC2 health checks) will the ELB treat it as healthy and start sending normal traffic to it. This can happen before the grace period expires. If you see 502 errors after the instance has passed its health checks and is receiving traffic, then your problem is somewhere else.

0
votes

I finally resolved my issue. I am writing up my solution here to help anyone else facing the same problem.

My initial impression was that the Application Load Balancer was routing traffic to the newly added instance before it was ready to serve. A detailed investigation showed that this was not the issue. The new instance was able to serve traffic from the start; a few minutes later it produced ELB-level 502 errors for around 30 seconds, and after that it worked normally.

Solution: The Application Load Balancer has a default idle timeout of 60 seconds, while Apache2 has a default KeepAliveTimeout of 5 seconds. Once a connection has been idle for 5 seconds, Apache2 closes it, even though the ELB still considers it open. If a request arrives at just the wrong moment, the ELB accepts it, decides which host to forward it to, and at that same moment Apache closes the connection. The result is the 502 error.

I set the ELB idle timeout to 60 seconds and the Apache2 KeepAliveTimeout to 120 seconds. This solved my problem.
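The rule behind this fix is that the backend should keep idle connections open longer than the load balancer does, so that the load balancer is always the side that closes them. Below is a small sketch of that check. The helper name `keepalive_is_safe` is hypothetical, and the values are the ones mentioned in this answer (on the Apache side, the relevant directive is `KeepAliveTimeout`).

```python
def keepalive_is_safe(lb_idle_timeout_s: int, backend_keepalive_s: int) -> bool:
    """Return True when the backend outlives the load balancer on idle connections.

    If the backend closes first, the load balancer may forward a request
    over a connection that is being torn down, which surfaces as a 502.
    """
    return backend_keepalive_s > lb_idle_timeout_s


# Defaults described above: ELB idle timeout 60 s, Apache KeepAliveTimeout 5 s.
print(keepalive_is_safe(lb_idle_timeout_s=60, backend_keepalive_s=5))

# Settings after the fix: ELB 60 s, Apache 120 s.
print(keepalive_is_safe(lb_idle_timeout_s=60, backend_keepalive_s=120))
```

A check like this could be run against your own configuration values before deploying; how you read those values out of your environment is left open here.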