I have configured an AWS ECS cluster with 3 instances (m5.large), one in each availability zone (A, B, and C). The service is configured as follows:

  • Service type: REPLICA
  • Number of Tasks: 3
  • Minimum Healthy Percent: 30
  • Maximum Percent: 100
  • Placement Templates: AZ Balanced Spread
  • Service AutoScaling: No.

In the Task Definition, I have used the following:

  • Network Mode: awsvpc
  • Task Memory: --
  • Task CPU: --

At the container level, I have configured only Memory Soft Limit:

  • Soft Limit: 2048 MB
  • Hard Limit: --

I have used awslogs for logging. The above configuration works, and when I start the service there is one Docker container running on each of the instances. Running 'docker stats' on one of the instances shows the following:

MEM USAGE / LIMIT  
230MiB    / 7.501GiB

And the container instance (ECS Console) shows the following:

Resources   Registered  Available  
CPU             2048       2048  
Memory          7680       5632  
Ports        5 ports

The above results are the same across all the 3 instances -- 2 GB of memory has been reserved (soft limit) and upper memory limit is instance memory of nearly 8 GB (no hard limit set). Everything works as expected so far.

But when I re-deploy the code (using force deploy) from Jenkins, I get the following error in the Jenkins Log:

"message": "(service App-V1-Service) was unable to place a task because no container instance met all of its requirements. The closest matching (container-instance 90d4ba21-4b19-4e31-c42d-d7223b34f17b) has insufficient memory available. For more information, see the Troubleshooting section of the Amazon ECS Developer Guide.

In Jenkins, the job shows up as 'Success', but it is the old version of the code that is still running. There is sufficient memory available on all three instances. Also, I have changed the Minimum Healthy Percent to 30, hoping that ECS can stop the old container and re-deploy the new one. Any solution or pointers to debug this further would be of great help.

1 Answer

During a deployment, the ECS scheduler allocates memory based on the soft limit of each container, which can add up to

2048 * 3 = 6144 MB 

which is more than the memory available on the instance:

5632 (available memory) < 6144 (required memory)
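
The comparison above can be reproduced with a small sketch. The numbers come from the question; the rule of summing the soft limit across all three tasks follows this answer's reasoning and is an assumption here, not a documented description of the ECS scheduler. The function names are hypothetical.

```python
# Values taken from the question. Summing the soft limit over all tasks
# mirrors the reasoning in this answer and is an assumption, not ECS's
# documented placement algorithm.
AVAILABLE_MB = 5632   # "Available" memory shown for the container instance
TASK_COUNT = 3        # number of tasks in the service


def required_memory_mb(soft_limit_mb: int, task_count: int) -> int:
    """Memory reserved if every task's soft limit is counted."""
    return soft_limit_mb * task_count


def fits(soft_limit_mb: int, task_count: int, available_mb: int) -> bool:
    """True when the summed reservations do not exceed the available memory."""
    return required_memory_mb(soft_limit_mb, task_count) <= available_mb


if __name__ == "__main__":
    for soft_limit in (2048, 1024):
        needed = required_memory_mb(soft_limit, TASK_COUNT)
        verdict = "fits" if fits(soft_limit, TASK_COUNT, AVAILABLE_MB) else "does not fit"
        print(f"soft limit {soft_limit} MB: {needed} MB needed, {verdict} in {AVAILABLE_MB} MB")
```

Under the same assumption, a 1024 MB soft limit needs 3072 MB in total, which is within the 5632 MB available.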

If you are running the replicas on the same ECS container instance, I would recommend keeping the soft limit low, at 1 GB or less; this is also what ECS suggests.

With this configuration, you will also be able to run blue-green deployments. There is no harm in keeping the soft limit low, because the container can still use more memory when it needs it, so reserving a large amount of memory as the soft limit does not affect performance.

I would not recommend lowering the Minimum Healthy Percent to 0, since decreasing the soft limit to 1 GB will resolve the issue.

Or, if you want to keep the same memory limit, decrease the Minimum Healthy Percent.
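
To see what a given Minimum Healthy Percent and Maximum Percent allow during a deployment, here is a minimal sketch. It assumes the rounding described in the ECS documentation (the minimum is rounded up, the maximum is rounded down); the function name is hypothetical.

```python
def deployment_task_bounds(desired_count: int,
                           minimum_healthy_percent: int,
                           maximum_percent: int) -> tuple[int, int]:
    """Return (lower, upper) bounds on the number of tasks during a deployment.

    Assumes the minimum is rounded up and the maximum is rounded down,
    as described in the ECS documentation.
    """
    # Integer ceiling division: -(-a // b) equals ceil(a / b) for positive b.
    lower = -(-desired_count * minimum_healthy_percent // 100)
    # Integer floor division.
    upper = desired_count * maximum_percent // 100
    return lower, upper


if __name__ == "__main__":
    # Settings from the question: 3 tasks, minimum 30, maximum 100.
    lower, upper = deployment_task_bounds(3, 30, 100)
    print(f"tasks that must stay running: {lower}, task ceiling: {upper}")
```

With the settings from the question the bounds are 1 and 3, so no task beyond the third can be started, and old tasks have to be stopped before their replacements are placed.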