I'm trying to get AutoScalingRollingUpdate to work on my autoscaling group: bring new instances online, then terminate the old instances only once the new instance(s) are accepting traffic. It seems like AutoScalingRollingUpdate is designed for this purpose.
I have the HealthCheckType of my AutoScalingGroup set to 'ELB'. I also have the HealthCheck on the ELB set to require:
- 3 successful requests to / for "healthy"
- 10 unsuccessful requests to / for "unhealthy"
- no grace period (HealthCheckGracePeriod of 0 on the AutoScalingGroup)
Now, from the ELB's perspective, when new instances come online, they are not InService for several minutes, which is what I expect. However, from the AutoScalingGroup's perspective, they are almost immediately being considered InService, and as such, my AutoScalingGroup is taking healthy instances out of service before the new instances are actually ready to receive traffic. I'm confused why the ASG thinks the instances are healthy before the ELB does, when the HealthCheckType is explicitly set to 'ELB'.
I've tried setting a grace period, but this doesn't change anything at all. In fact, I removed the grace period of 300 seconds because I thought maybe instances were implicitly "InService" during the grace period or something.
I know I can set a PauseTime on the rolling update policy, but that is fragile, because sometimes failures happen when instances come online and they get nuked and replaced before they ever finish provisioning, so sometimes, the PauseTime window may be exceeded. Also, I'd like to minimize the amount of time my app is running two different versions at the same time.
    ... ELB stuff ...
    "HealthCheck": {
        "HealthyThreshold": "3",
        "UnhealthyThreshold": "10",
        "Interval": "30",
        "Timeout": "15",
        "Target": {
            "Fn::Join": [
                "",
                [
                    {"Fn::Join": [":", ["HTTP", {"Ref": "hostPort"}]]},
                    {"Ref": "healthCheckPath"}
                ]
            ]
        }
    },

    ... ASG Stuff ...
    {
        ... snip ...
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": "0",
        "Cooldown": "300"
    },
    "UpdatePolicy" : {
        "AutoScalingRollingUpdate" : {
            "MinInstancesInService" : "1",
            "MaxBatchSize" : "1"
        }
    }
AutoScalingGroup setting, it is in your ELB setting. `"HealthCheckGracePeriod": "0"` gives me a strange feeling; could you change it to 300? After that, ELB will take care of the availability, not ASG. ASG will scale up and down depending on ELB status. – BMW
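For what it's worth, the fixed PauseTime window the question worries about can be avoided by having CloudFormation wait for an explicit success signal from each new instance before it moves on to the next batch. Below is a minimal sketch of the UpdatePolicy from the question with that behaviour enabled. It assumes each instance sends its own signal (for example by running cfn-signal from its provisioning script once the app responds on the health check path); that step is not part of the original template. The "PT15M" value is purely illustrative. JSON does not allow comments, so the assumptions are stated here rather than inline.

```json
{
  "UpdatePolicy": {
    "AutoScalingRollingUpdate": {
      "MinInstancesInService": "1",
      "MaxBatchSize": "1",
      "WaitOnResourceSignals": "true",
      "PauseTime": "PT15M"
    }
  }
}
```

With WaitOnResourceSignals enabled, PauseTime becomes the maximum time CloudFormation waits for the signal rather than a fixed sleep, so a fast instance does not hold up the update and a slow one is not treated as ready prematurely. Whether this fully matches the ELB's own InService state depends on when the instance chooses to send the signal.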