I've been given an AWS environment to look after. It runs ECS on EC2 instances, with scaling driven by the ECS MemoryReservation metric. The system was originally built before ECS Cluster Auto Scaling was made generally available, so it just uses a CloudWatch metric to scale out and scale in. As far as I can work out, it follows a basic AWS reference design:
- The EC2 instances are in an Auto Scaling group that can scale from 1 to 5 instances, with 1 as the desired count.
- There is 1 cluster running 6 services.
- 5 of those services have a desired count of 1 task and a maximum of 2; the other is capped at a maximum of 1.
- The tasks have MemoryReservation (soft limit) figures configured but not Memory (hard limit).
- The tasks are primarily running Java.
- The highest memory reservation is set at about 200MB and most are around this figure.
- The scale-out rule triggers on the cluster's MemoryReservation metric at 85% (see the sketch after this list for how I check what that metric sees).
- `docker stats` shows most of the tasks actually using about 300MB, with some exceeding 600MB.
- The instances have 4GB of RAM.
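For what it's worth, this is roughly how I've been sanity-checking what the scaling alarm actually sees (a boto3 sketch; the cluster name is a placeholder for my setup):

```python
# Sketch: read the cluster-level MemoryReservation metric that the
# scale-out alarm watches. "my-cluster" is a placeholder.
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/ECS",
    MetricName="MemoryReservation",  # percent of registered memory reserved
    Dimensions=[{"Name": "ClusterName", "Value": "my-cluster"}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,
    Statistics=["Average"],
)

for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], f'{point["Average"]:.1f}%')
```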
If the maximum total reservation is about 2GB (at full scale that's 11 tasks × ~200MB ≈ 2.2GB), then even if the tasks are really consuming more like 3GB, am I right in believing that the scale-out rule will NEVER be invoked? The MemoryReservation metric only counts the soft limits, and 2GB is roughly 50% of available RAM, well short of the 85% threshold. Do I need to increase the memory reservations to something more realistic?
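My back-of-envelope math, assuming ~200MB reservations across the board:

```python
# Rough check of why the 85% alarm may never fire.
# All figures below are my assumptions from the setup described above.
reservation_mb_per_task = 200
max_tasks = 5 * 2 + 1          # 5 services at max 2 tasks, 1 capped at 1
instance_ram_mb = 4096         # nominal; the ECS agent registers slightly less

reserved_mb = reservation_mb_per_task * max_tasks       # 2200 MB
reservation_pct = 100 * reserved_mb / instance_ram_mb
print(f"{reservation_pct:.0f}%")  # ~54%, well under the 85% threshold
```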
Also, since it is only running a single EC2 instance: am I right in thinking that even if I increased the MemoryReservation figures to something more realistic, the fact that there's no theoretical room left to place another task won't, by itself, spin up a second EC2 instance? I've just picked this up from various articles I've been reading while searching.
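My understanding of the wiring, which is why I suspect a failed task placement alone never scales the ASG, is something like this (a sketch only; the policy, alarm, ASG, and cluster names are all placeholders, not my actual config):

```python
# Sketch of what I assume the pre-capacity-provider scale-out wiring
# looks like. The ASG only ever reacts to this CloudWatch alarm; the
# ECS scheduler failing to place a task does not feed into it.
import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Simple scaling policy that adds one EC2 instance when triggered.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="ecs-asg",               # placeholder
    PolicyName="scale-out-on-memory-reservation",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# Alarm on the cluster's reserved (not actual) memory percentage.
cloudwatch.put_metric_alarm(
    AlarmName="ecs-memory-reservation-high",      # placeholder
    Namespace="AWS/ECS",
    MetricName="MemoryReservation",
    Dimensions=[{"Name": "ClusterName", "Value": "my-cluster"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=85.0,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```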
Thanks