I am creating an Auto Scaling Group (ASG) with a CloudFormation (CF) template, and have enabled both EC2 and ELB health checks. What I would like to see is that CF refrain from marking the ASG as a successful deployment until the ASG meets its minimum instance count when considering both the EC2 and ELB health checks. However, this is not the behavior I'm seeing.
If, for instance, the first instances in the ASG fail their EC2 health checks, the ASG creates new instances as expected until it meets its minimum requirement. In the meantime, the ASG is listed as a CREATE_IN_PROGRESS resource from CF's perspective. If the ASG never meets its minimum healthy instance count, the ASG remains in CREATE_IN_PROGRESS indefinitely. Although this is not an ideal outcome, it is at least obvious that there's a problem, and it will eventually trigger human intervention.
If, however, the first instances pass their EC2 health checks but fail their ELB health checks, the ASG creates new instances as before until it meets its minimum requirement. In the meantime, however, the ASG is listed as CREATE_COMPLETE from CF's perspective, even before the HealthCheckGracePeriod has elapsed. Long after the CF stack has deployed and considered itself CREATE_COMPLETE, the ASG is still cycling through instances that never pass the ELB health checks. This is the behavior I would like to prevent.
According to the Auto Scaling Health Check docs:
All instances in your Auto Scaling group start in the healthy state. Instances are assumed to be healthy unless Amazon EC2 Auto Scaling receives notification that they are unhealthy. This notification can come from one or more of the following sources: Amazon EC2, Elastic Load Balancing (ELB), or a custom health check.
This description matches the behavior I'm seeing, so it appears CF w/ASG is operating as designed/documented, but in my opinion, the eager reporting of health on one of the two configured criteria seems optimistic. I would expect both criteria to pass at least once before claiming success.
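To make my reading of that quoted passage concrete, here is a toy Python model. It is only a sketch of the documented rule ("healthy unless notified otherwise"), not of how Auto Scaling or CloudFormation is actually implemented; the names `Instance`, `unhealthy_reports`, and `meets_minimum` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    # Hypothetical field: the sources ("EC2", "ELB", "custom") that have
    # reported this instance as unhealthy so far. Empty at launch.
    unhealthy_reports: set = field(default_factory=set)

    @property
    def healthy(self) -> bool:
        # Per the quoted docs: assumed healthy until some source says otherwise.
        return not self.unhealthy_reports


def meets_minimum(instances, min_size):
    """Return True if at least min_size instances are currently considered healthy."""
    return sum(1 for inst in instances if inst.healthy) >= min_size


# Two freshly launched instances: no source has reported anything yet.
group = [Instance(), Instance()]
ready_at_launch = meets_minimum(group, min_size=2)

# Later, the ELB reports the first instance as unhealthy.
group[0].unhealthy_reports.add("ELB")
ready_after_elb_report = meets_minimum(group, min_size=2)
```

Under this model the group already "meets its minimum" at launch, before any ELB check has had a chance to fail, which is consistent with the early CREATE_COMPLETE I'm seeing.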
Here is an abbreviated CF template snippet for the ASG:
Resources:
  MyAppScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      AutoScalingGroupName: !Ref InstanceName
      CapacityRebalance: true
      HealthCheckGracePeriod: 300 # seconds
      HealthCheckType: ELB
      LaunchTemplate:
        LaunchTemplateId: !Ref MyAppLaunchTemplate
        Version: !GetAtt MyAppLaunchTemplate.DefaultVersionNumber
      MaxSize: !Ref MaxInstances
      MinSize: !Ref MinInstances
      TargetGroupARNs: [ !Ref MyAppTargetGroup ]
      VPCZoneIdentifier: !Split [ ",", !Ref Subnets ]