I'm looking for guidance on allocating memory for an ECS task. I'm running a Rails app for a client who wants to keep server costs as low as possible. I was looking at the medium instance size, which has 2 vCPUs and 4 GB of memory.
Most of the time I'll only need 1 container running the Rails server. However, there are occasional spikes, and during those I want to scale out to a second instance and have another container deployed to it. When traffic slows down, I want to scale back down to the single instance / task.
Here's where I need help:
What should I make my task memory setting? 4 GB? That would be the total on the box, but it doesn't account for system processes (the OS, Docker, the ECS agent). I could do 3 GB, but then I'd be leaving some memory unused. Same question for the CPU... should I just allocate 100%?
I don't want to pay for a bigger server, e.g. one with 16 GB, that sits there with only 1 container needed most of the time... such a waste.
What I want seems simple: 1 task per instance. When the instance gets to 75% usage, launch a new instance and deploy the task to it. I don't get why I have to set task memory and CPU at all when the mapping is one-to-one.
Can anyone give me guidance on how to do what I've described? Or what the proper task definition settings should be when it's meant to be one-to-one with the instance?
Thanks for any help.
--Edit--
Based on feedback, here's a potential solution:
Task definition: memoryReservation (soft limit) of 3 GB and memory (hard limit) of 4 GB; registering this is sketched below.
EC2 medium nodes, which have 4 GB of memory.
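For reference, a minimal sketch of registering that task definition with boto3. The family name, image URI, and port mapping are placeholders, not anything prescribed by ECS:

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# 3 GB soft limit (what the scheduler reserves for placement) and a
# 4 GB hard limit (the container is killed if it exceeds this).
ecs.register_task_definition(
    family="rails-app",  # hypothetical family name
    containerDefinitions=[
        {
            "name": "rails",
            "image": "my-registry/rails-app:latest",  # placeholder image
            "memoryReservation": 3072,  # soft limit, in MiB
            "memory": 4096,             # hard limit, in MiB
            "essential": True,
            "portMappings": [{"containerPort": 3000, "hostPort": 80}],
        }
    ],
)
```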
ECS service autoscaling configured (a boto3 sketch follows this list):
- scale up (increase task count by 1) when service CPU utilization is greater than 75%.
- scale down (decrease task count by 1) when service CPU utilization is less than 25%.
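Here's roughly what the scale-up half looks like via Application Auto Scaling; the cluster and service names are placeholders, and the scale-down policy would be the mirror image with `ScalingAdjustment=-1` behind a "< 25%" alarm:

```python
import boto3

aas = boto3.client("application-autoscaling", region_name="us-east-1")

# Let the service's desired count float between 1 and 2 tasks.
aas.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/rails-service",  # placeholder names
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=1,
    MaxCapacity=2,
)

# Step policy: add one task. A CloudWatch alarm on the service's
# CPUUtilization metric (> 75%) would invoke this policy's ARN.
aas.put_scaling_policy(
    PolicyName="rails-cpu-scale-up",
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/rails-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="StepScaling",
    StepScalingPolicyConfiguration={
        "AdjustmentType": "ChangeInCapacity",
        "StepAdjustments": [{"MetricIntervalLowerBound": 0.0, "ScalingAdjustment": 1}],
        "Cooldown": 300,
    },
)
```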
ECS cluster scaling configured (sketched below), driven by the cluster's memory reservation rather than actual usage:
- scale up (increase EC2 instance count by 1) when cluster memory reservation is greater than 80%.
- scale down (decrease EC2 instance count by 1) when cluster memory reservation is less than 40%.
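A sketch of the cluster-level scale-up, wiring a CloudWatch alarm on the cluster's MemoryReservation metric to a simple scaling policy on the cluster's Auto Scaling group; the ASG and cluster names are placeholders:

```python
import boto3

asg = boto3.client("autoscaling", region_name="us-east-1")
cw = boto3.client("cloudwatch", region_name="us-east-1")

# Simple scaling policy on the cluster's Auto Scaling group
# (group name is a placeholder).
policy = asg.put_scaling_policy(
    AutoScalingGroupName="ecs-cluster-asg",
    PolicyName="cluster-memory-scale-up",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# Fire the policy when the cluster's reserved memory stays above 80%.
cw.put_metric_alarm(
    AlarmName="my-cluster-memory-reservation-high",
    Namespace="AWS/ECS",
    MetricName="MemoryReservation",
    Dimensions=[{"Name": "ClusterName", "Value": "my-cluster"}],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```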
Example: start with 1 EC2 instance running a task with a 3 GB reservation. That's 75% cluster memory reservation (3 GB of 4 GB).
When traffic spikes and the service's CPU utilization goes above 75%, the service scaling policy fires. The desired task count increases, and the new task asks for another 3 GB reservation, making the total demand 6 GB against only 4 GB available, i.e. 150% of the cluster's capacity (the second task sits in PENDING until it can be placed).
This pushes memory reservation past the 80% cluster scale-up threshold, which adds a new EC2 node to the cluster for the new task. Once it registers, we're back down to 6 GB reserved out of 8 GB available, which is 75% and stable.
The scale-down would happen the same way in reverse: service CPU drops below 25%, a task is removed, cluster memory reservation falls below 40%, and an instance is terminated.