I have a rather specific use case. I need to auto-scale a distributed web app running on ECS Fargate. The catch is that every node must keep the same data in memory, so increasing the number of nodes does not help with memory pressure. Increasing load can therefore only be handled properly if the app scales both horizontally (adding nodes) and vertically (increasing each node's memory).
Horizontal auto-scaling is simple. AWS CDK provides nice high-level constructs for load-balanced Fargate tasks and makes it easy to add more tasks to handle CPU load:
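from aws_cdk import aws_ecs_patterns

# Load-balanced Fargate service; cpu and memory_limit_mib size each task.
service = aws_ecs_patterns.ApplicationLoadBalancedFargateService(
    self,
    'FargateService',
    cpu=256,
    memory_limit_mib=512,
    ...
)
# Scale out to at most 5 tasks, targeting 60% average CPU utilization.
scalable_target = service.service.auto_scale_task_count(max_capacity=5)
scalable_target.scale_on_cpu_utilization('CpuScaling', target_utilization_percent=60)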
What I'm looking for is the vertical scaling part. So far my best idea is the following:
- Create a CloudWatch alarm on the memory usage of the cluster that triggers above 60%.
- The alarm sends a message to an SNS topic, which triggers a Lambda function.
- The Lambda function describes the current task definition and reads its CPU and memory parameters. It then registers a new revision of the task definition with more memory (and more CPU if needed, because CPU and memory are not independent values in Fargate).
- Finally, the Lambda function updates the service to use the new task definition. This should trigger a rolling update and result in a cluster with the same number of nodes, each with more memory.
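To make the last two steps concrete, here is a rough sketch of what I imagine the Lambda function would look like. The environment variable names `CLUSTER_NAME` and `SERVICE_NAME`, the helper `next_fargate_size`, and the "move to the next larger valid size" policy are all my own placeholders, and the size table is my understanding of the valid Fargate CPU/memory combinations, so it should be checked against the current AWS documentation. I have not run this against a real service.

```python
import os

# Assumed table of valid Fargate task sizes: CPU units -> allowed memory (MiB).
# Larger CPU tiers are left out; verify against the current Fargate docs.
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}

# Fields from describe_task_definition that register_task_definition accepts.
# Read-only fields (revision, status, requiresAttributes, ...) are dropped.
REGISTER_FIELDS = (
    "family", "taskRoleArn", "executionRoleArn", "networkMode",
    "containerDefinitions", "volumes", "placementConstraints",
    "requiresCompatibilities", "runtimePlatform", "ephemeralStorage",
)


def next_fargate_size(cpu, memory):
    """Return the next (cpu, memory) pair with more memory, or None at the top.

    Keeps the current CPU when it allows more memory, otherwise moves up
    to the next CPU tier that does.
    """
    for candidate_cpu in sorted(FARGATE_SIZES):
        if candidate_cpu < cpu:
            continue
        larger = [m for m in FARGATE_SIZES[candidate_cpu] if m > memory]
        if larger:
            return candidate_cpu, larger[0]
    return None


def handler(event, context):
    """Hypothetical SNS-triggered entry point; the event payload is ignored."""
    import boto3  # imported here so the sizing logic can be tested without it

    cluster = os.environ["CLUSTER_NAME"]  # placeholder variable names
    service_name = os.environ["SERVICE_NAME"]
    ecs = boto3.client("ecs")

    service = ecs.describe_services(
        cluster=cluster, services=[service_name]
    )["services"][0]
    task_def = ecs.describe_task_definition(
        taskDefinition=service["taskDefinition"]
    )["taskDefinition"]

    new_size = next_fargate_size(int(task_def["cpu"]), int(task_def["memory"]))
    if new_size is None:
        return {"scaled": False}  # already at the largest size in the table

    params = {k: task_def[k] for k in REGISTER_FIELDS if k in task_def}
    # Task-level cpu/memory are strings in the ECS API. Container-level
    # memory limits inside containerDefinitions, if set, are left unchanged.
    params["cpu"], params["memory"] = str(new_size[0]), str(new_size[1])

    new_arn = ecs.register_task_definition(**params)[
        "taskDefinition"]["taskDefinitionArn"]
    # Pointing the service at the new revision starts a rolling deployment.
    ecs.update_service(
        cluster=cluster, service=service_name, taskDefinition=new_arn
    )
    return {"scaled": True, "cpu": new_size[0], "memory": new_size[1]}
```

With the 256 CPU / 512 MiB task from the CDK snippet, `next_fargate_size(256, 512)` would pick 256 CPU / 1024 MiB, and once 2048 MiB is exceeded it would have to move up to the 512 CPU tier.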
Do you think this could work? Is there any better solution? Any potential issues you can spot?
Thanks in advance for any ideas!