I have created a CloudFormation template that creates an ECS service and task definition, with autoscaling for the tasks. It is pretty basic: if MemoryUtilization for the tasks reaches a certain value, add 1 task, and vice versa. Here are the most relevant parts of the template.
EcsTd:
  Type: AWS::ECS::TaskDefinition
  DependsOn: LogGroup
  Properties:
    Family: !Sub ${EnvironmentName}-${PlatformName}-${Type}
    ContainerDefinitions:
      - Name: !Sub ${EnvironmentName}-${PlatformName}-${Type}
        Image: !Sub ${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/${PlatformName}:${ImageVersion}
        Environment:
          - Name: APP_ENV
            Value: !If [isProd, "production", "staging"]
          - Name: APP_DEBUG
            Value: "false"
        ...
        PortMappings:
          - ContainerPort: 80
            HostPort: 0
        Memory: !Ref Memory
        Essential: true
EcsService:
  Type: AWS::ECS::Service
  DependsOn: WaitForLoadBalancerListenerRulesCondition
  Properties:
    ServiceName: !Sub ${EnvironmentName}-${PlatformName}-${Type}
    Cluster:
      Fn::ImportValue: !Sub ${EnvironmentName}-ECS-${Type}
    DesiredCount: !Ref DesiredCount
    TaskDefinition: !Ref EcsTd
    Role: "learningEcsServiceRole"
    LoadBalancers:
      - !If
        - isWeb
        - ContainerPort: 80
          ContainerName: !Sub ${EnvironmentName}-${PlatformName}-${Type}
          TargetGroupArn: !Ref AlbTargetGroup
        - !Ref AWS::NoValue
ServiceScalableTarget:
  Type: "AWS::ApplicationAutoScaling::ScalableTarget"
  Properties:
    MaxCapacity: !Ref MaxCount
    MinCapacity: !Ref MinCount
    ResourceId: !Join
      - /
      - - service
        - !Sub ${EnvironmentName}-${Type}
        - !GetAtt EcsService.Name
    RoleARN: arn:aws:iam::645618565575:role/learningEcsServiceRole
    ScalableDimension: ecs:service:DesiredCount
    ServiceNamespace: ecs
ServiceScaleOutPolicy:
  Type: "AWS::ApplicationAutoScaling::ScalingPolicy"
  Properties:
    PolicyName: !Sub ${EnvironmentName}-${PlatformName}-${Type}-ScaleOutPolicy
    PolicyType: StepScaling
    ScalingTargetId: !Ref ServiceScalableTarget
    StepScalingPolicyConfiguration:
      AdjustmentType: ChangeInCapacity
      Cooldown: 1800
      MetricAggregationType: Average
      StepAdjustments:
        - MetricIntervalLowerBound: 0
          ScalingAdjustment: 1
MemoryScaleOutAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: !Sub ${EnvironmentName}-${PlatformName}-${Type}-MemoryOver70PercentAlarm
    AlarmDescription: Alarm if memory utilization is greater than 70% of reserved memory
    Namespace: AWS/ECS
    MetricName: MemoryUtilization
    Dimensions:
      - Name: ClusterName
        Value: !Sub ${EnvironmentName}-${Type}
      - Name: ServiceName
        Value: !GetAtt EcsService.Name
    Statistic: Maximum
    Period: '60'
    EvaluationPeriods: '1'
    Threshold: '70'
    ComparisonOperator: GreaterThanThreshold
    AlarmActions:
      - !Ref ServiceScaleOutPolicy
      - !Ref EmailNotification
...
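For completeness: the scale-in ("vice versa") side is among the parts elided above. It just mirrors the scale-out policy with a negative adjustment, triggered by a LessThanThreshold alarm, roughly like this untested sketch (the resource name is illustrative, and the matching alarm, omitted here, would be MemoryScaleOutAlarm's mirror with LessThanThreshold):

ServiceScaleInPolicy:
  Type: "AWS::ApplicationAutoScaling::ScalingPolicy"
  Properties:
    PolicyName: !Sub ${EnvironmentName}-${PlatformName}-${Type}-ScaleInPolicy
    PolicyType: StepScaling
    ScalingTargetId: !Ref ServiceScalableTarget
    StepScalingPolicyConfiguration:
      AdjustmentType: ChangeInCapacity
      Cooldown: 1800
      MetricAggregationType: Average
      StepAdjustments:
        # Negative adjustment removes one task when the low-memory alarm fires
        - MetricIntervalUpperBound: 0
          ScalingAdjustment: -1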
So whenever a task starts to run out of memory, we'll add a new task. However, at some point we'll hit the limit of how much memory is available in our cluster.
For example, if the cluster consists of one t2.small instance, we have 2 GB of RAM. A small amount of that is used by the ECS agent running on the instance, so we have somewhat less than 2 GB available. If we set the task's memory to 512 MB, we can fit only 3 tasks in that cluster (3 × 512 MB = 1536 MB fits, but a 4th task would require 2048 MB, which is more than the instance has) unless we scale up the cluster.
By default an ECS cluster has a MemoryReservation metric that can be used for autoscaling the cluster. We would say that when MemoryReservation is more than 75%, add 1 instance to the cluster. That's relatively easy.
EcsCluster:
  Type: AWS::ECS::Cluster
  Properties:
    ClusterName: !Sub ${EnvironmentName}-${Type}
SgEcsHost:
  ...
ECSLaunchConfiguration:
  Type: AWS::AutoScaling::LaunchConfiguration
  Properties:
    ImageId: !FindInMap [AWSRegionToAMI, !Ref 'AWS::Region', AMIID]
    InstanceType: !Ref InstanceType
    SecurityGroups: [ !Ref SgEcsHost ]
    AssociatePublicIpAddress: true
    IamInstanceProfile: "ecsInstanceRole"
    KeyName: !Ref KeyName
    UserData:
      Fn::Base64: !Sub |
        #!/bin/bash
        echo ECS_CLUSTER=${EnvironmentName}-${Type} >> /etc/ecs/ecs.config
ECSAutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    VPCZoneIdentifier:
      - Fn::ImportValue: !Sub ${EnvironmentName}-SubnetEC2AZ1
      - Fn::ImportValue: !Sub ${EnvironmentName}-SubnetEC2AZ2
    LaunchConfigurationName: !Ref ECSLaunchConfiguration
    MinSize: !Ref AsgMinSize
    MaxSize: !Ref AsgMaxSize
    DesiredCapacity: !Ref AsgDesiredSize
    Tags:
      - Key: Name
        Value: !Sub ${EnvironmentName}-ECS
        PropagateAtLaunch: true
ScalePolicyUp:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AdjustmentType: ChangeInCapacity
    AutoScalingGroupName: !Ref ECSAutoScalingGroup
    Cooldown: '1'
    ScalingAdjustment: '1'
MemoryReservationAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    EvaluationPeriods: '1'
    Statistic: Average
    Threshold: '75'
    AlarmDescription: Alarm if MemoryReservation is more than 75%
    Period: '60'
    AlarmActions:
      - !Ref ScalePolicyUp
      - !Ref EmailNotification
    # MemoryReservation is published in the AWS/ECS namespace with a
    # ClusterName dimension, not under AWS/EC2
    Namespace: AWS/ECS
    Dimensions:
      - Name: ClusterName
        Value: !Sub ${EnvironmentName}-${Type}
    ComparisonOperator: GreaterThanThreshold
    MetricName: MemoryReservation
However, that approach is wasteful: the alarm would fire when the third task is placed, so the new instance would sit empty until a 4th task is scheduled. That means we'd be paying for an instance we don't use.
I have noticed that when the ECS service tries to add a task to a cluster that does not have enough free memory, I get:
service Production-admin-worker was unable to place a task because no container instance met all of its requirements. The closest matching container-instance ################### has insufficient memory available.
In this example the template's parameters are:
EnvironmentName=Production
PlatformName=Admin
Type=worker
Is it possible to create an AWS::CloudWatch::Alarm that watches ECS cluster events and looks for that particular pattern? The idea would be to scale up the instance count in the cluster via the AWS::AutoScaling::AutoScalingGroup only when the AWS::ApplicationAutoScaling::ScalingPolicy adds tasks that don't have space in the cluster, and to scale the cluster down when MemoryReservation is less than 25% (meaning there are no tasks running there because the AWS::ApplicationAutoScaling::ScalingPolicy has removed them).
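To make the first half concrete, this is roughly the kind of rule I have in mind (an untested sketch: I am assuming ECS publishes a placement-failure service event such as SERVICE_TASK_PLACEMENT_FAILURE to CloudWatch Events, and since an event rule cannot invoke an Auto Scaling policy directly, ScaleClusterFunction stands in for a hypothetical Lambda that would increment the ASG's desired capacity):

PlacementFailureRule:
  Type: AWS::Events::Rule
  Properties:
    Description: React when the service cannot place a task for lack of memory
    EventPattern:
      source:
        - aws.ecs
      detail-type:
        - ECS Service Action
      detail:
        eventType:
          - ERROR
        eventName:
          - SERVICE_TASK_PLACEMENT_FAILURE
    Targets:
      # Hypothetical Lambda that bumps ECSAutoScalingGroup's DesiredCapacity by 1
      - Arn: !GetAtt ScaleClusterFunction.Arn
        Id: ScaleClusterOnPlacementFailure

The scale-down half seems simpler, since it can just mirror ScalePolicyUp and MemoryReservationAlarm with the comparison flipped, something like (resource names and cooldown illustrative):

ScalePolicyDown:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AdjustmentType: ChangeInCapacity
    AutoScalingGroupName: !Ref ECSAutoScalingGroup
    Cooldown: '300'
    ScalingAdjustment: '-1'
MemoryReservationLowAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: Alarm if MemoryReservation is below 25%
    Namespace: AWS/ECS
    MetricName: MemoryReservation
    Dimensions:
      - Name: ClusterName
        Value: !Sub ${EnvironmentName}-${Type}
    Statistic: Average
    Period: '60'
    EvaluationPeriods: '1'
    Threshold: '25'
    ComparisonOperator: LessThanThreshold
    AlarmActions:
      - !Ref ScalePolicyDown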