
We have a Terraform deployment that creates an auto-scaling group (ASG) of EC2 instances, which we use as Docker hosts in an ECS cluster. Tasks are running on the cluster. Replacing the tasks (e.g. with a newer version) works fine: we create a new task definition revision and update the service, and AWS performs a rolling update. However, how can I easily replace the EC2 host instances with newer ones without any downtime?

I'd like to do this so that a change to the ASG launch configuration takes effect, for example switching to a different EC2 instance type.

I've tried a few things, here's what I think gets closest to what I want:

  1. Drain one instance. The tasks will be distributed to the remaining instances.
  2. Once no tasks are running in that instance anymore, terminate it.
  3. Wait for the ASG to spin up a new instance.
  4. Repeat steps 1 to 3 until all instances are new.
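
The four steps above could be scripted roughly as follows. This is only a sketch, not a tested production tool: it assumes `ecs` and `autoscaling` are boto3 clients (`boto3.client("ecs")` and `boto3.client("autoscaling")`), that the cluster has fewer than 100 container instances, and that every instance belongs to the ASG. The function name `replace_container_instances` and the helper `wait_until` are made up for illustration, and there are no timeouts, so a task that never stops would block the loop forever.

```python
import time


def wait_until(condition, poll_seconds, sleep):
    """Poll `condition` until it returns True (no timeout in this sketch)."""
    while not condition():
        sleep(poll_seconds)


def replace_container_instances(ecs, autoscaling, cluster,
                                poll_seconds=15, sleep=time.sleep):
    """Drain and terminate every current container instance, one at a time.

    Returns the EC2 instance IDs that were terminated, in order.
    """

    def active_arns():
        # Assumption: fewer than 100 instances, so one page of results is enough.
        response = ecs.list_container_instances(cluster=cluster, status="ACTIVE")
        return response["containerInstanceArns"]

    def describe(arn):
        response = ecs.describe_container_instances(
            cluster=cluster, containerInstances=[arn])
        return response["containerInstances"][0]

    # Snapshot the instances that exist now, so replacements are left alone.
    old_arns = active_arns()
    target_count = len(old_arns)
    terminated = []

    for arn in old_arns:
        # Step 1: drain the instance; ECS moves its service tasks elsewhere.
        ecs.update_container_instances_state(
            cluster=cluster, containerInstances=[arn], status="DRAINING")

        # Step 2: once no tasks are running on it, terminate it.
        wait_until(lambda: describe(arn)["runningTasksCount"] == 0,
                   poll_seconds, sleep)
        instance_id = describe(arn)["ec2InstanceId"]
        autoscaling.terminate_instance_in_auto_scaling_group(
            InstanceId=instance_id, ShouldDecrementDesiredCapacity=False)
        terminated.append(instance_id)

        # Step 3: wait until the ASG's replacement has registered with ECS.
        wait_until(lambda: len(active_arns()) >= target_count,
                   poll_seconds, sleep)

    return terminated
```

Calling `replace_container_instances(ecs, autoscaling, "my-cluster")` (the cluster name is a placeholder) would walk through the instances in turn. It automates the manual procedure but does not rebalance tasks afterwards, so the last new instance can still end up empty.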

This almost works. The problems are:

  1. It's manual and therefore error prone.
  2. After this process one of the instances (the last one that was spun up) is running 0 (zero) tasks.

Is there a better, automated way of doing this? Also, is there a way to re-distribute the tasks in an ECS cluster (without creating a new task revision)?

I've just updated my old answer at stackoverflow.com/a/39977487/2291321 because (as mentioned in the original comments) it didn't take into account safely draining ECS tasks from the old ASG before destroying them, which led to a service interruption. – ydaetskcoR

1 Answer


Before making changes, make sure that the ASG spans multiple availability zones and that the containers are spread across them as well. This keeps the service highly available when the instances in one zone are down.

You can configure an update policy on the Auto Scaling group with AutoScalingRollingUpdate (a CloudFormation UpdatePolicy attribute), where you can set MinInstancesInService and MinSuccessfulInstancesPercent to higher values to keep the rolling upgrade slow and safe.
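
For illustration, here is roughly what such an update policy looks like, built as a Python dict and printed as JSON. This assumes the Auto Scaling group is defined in a CloudFormation template, where the `UpdatePolicy` attribute sits next to `Type` and `Properties` on the `AWS::AutoScaling::AutoScalingGroup` resource. The numbers are placeholder values, not recommendations.

```python
import json

# Placeholder values: tune them to the size of your group.
update_policy = {
    "UpdatePolicy": {
        "AutoScalingRollingUpdate": {
            # Keep at least this many instances running during the update.
            "MinInstancesInService": 2,
            # Replace one instance at a time.
            "MaxBatchSize": 1,
            # Every replacement must succeed for the update to succeed.
            "MinSuccessfulInstancesPercent": 100,
            # Pause between batches (ISO 8601 duration, here 5 minutes).
            "PauseTime": "PT5M",
        }
    }
}

# Emit the fragment in the JSON form a CloudFormation template accepts.
print(json.dumps(update_policy, indent=2))
```

A lower `MaxBatchSize` and a higher `MinInstancesInService` make the rollout slower but leave more capacity available for tasks while instances are being replaced.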

You may go through this documentation to find further tweaks. To automate the process, you can use Terraform to update the ASG launch configuration. This points the ASG at the new launch configuration and, with such an update policy in place, triggers a rolling upgrade.