1 vote

I have several aws_instance nodes that are in a load balancer target group in Terraform. I made a change that requires destroying each instance and recreating it. By default, Terraform will destroy and recreate all of these instances at the same time. Destroying all of them at once is bad because then no nodes will be in the load balancer.

Is there a way to configure Terraform so it waits for one instance to be fully destroyed/recreated before destroying/creating the other instances?

2
Do you just have straight instances (using the aws_instance resource) or are you using an autoscaling group? This is easier to achieve with an ASG even if you don't need to actually autoscale (set min and max to the same value and/or attach no autoscaling policy). – ydaetskcoR
Instances using the aws_instance resource, though I may follow your suggestion below. – Kevin Burke

2 Answers

2 votes

You can use the create_before_destroy lifecycle customisation to force Terraform to create the new resource before destroying the old one during a replacement action.
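As a minimal sketch (the AMI variable is a placeholder, not from your configuration), the lifecycle block looks like this:

```hcl
resource "aws_instance" "app" {
  ami           = var.ami_id # hypothetical variable
  instance_type = "t3.micro"

  lifecycle {
    # Build the replacement instance first, then destroy the old one.
    create_before_destroy = true
  }
}
```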

Unfortunately, if your instance takes a while to start the service you need, you're still going to have a problem: as soon as the AWS API reports that the instance is running, Terraform considers the replacement done and starts terminating the old instance.

You can solve this by putting the instances in an autoscaling group (even if you don't need them to autoscale, by setting the same min and max size or attaching no autoscaling policy) and setting the health_check_type to ELB. This makes sure an instance isn't considered healthy until it passes the load balancer's health checks, rather than the default EC2 health checks (i.e. the instance is running and has no system or instance status check failures). Terraform will then wait until the new ASG has the minimum number of instances passing the load balancer health checks (and is attached to the relevant target group or ELB) before it considers the replacement complete and starts removing the old ASG.
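A sketch of the ASG approach, assuming a launch configuration and target group defined elsewhere in your configuration (the names here are placeholders):

```hcl
resource "aws_autoscaling_group" "app" {
  name_prefix          = "app-"
  min_size             = 3
  max_size             = 3 # same min and max: no actual autoscaling
  launch_configuration = aws_launch_configuration.app.name # hypothetical
  target_group_arns    = [aws_lb_target_group.app.arn]     # hypothetical

  # Use load balancer health checks, not just EC2 status checks.
  health_check_type         = "ELB"
  health_check_grace_period = 300

  # Terraform waits until this many instances pass ELB health checks
  # before considering the new ASG created.
  min_elb_capacity = 3

  lifecycle {
    create_before_destroy = true
  }
}
```

With create_before_destroy on the ASG (and a name_prefix so the names don't collide), a replacement builds the new group, waits for min_elb_capacity healthy instances, and only then tears down the old group.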

0 votes

There is the depends_on attribute, which lets you set up explicit dependencies and create things in order. It is limited for your scenario because it only waits for the new instances to be created, not for them to be "ready".
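For example (the resource names and variable here are illustrative):

```hcl
resource "aws_instance" "first" {
  ami           = var.ami_id # hypothetical variable
  instance_type = "t3.micro"
}

resource "aws_instance" "second" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Created only after the first instance exists; Terraform does not
  # wait for the first instance to be serving traffic.
  depends_on = [aws_instance.first]
}
```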

One idea I had when reading your scenario: you could use the external data source. I'm not positive it was intended for this kind of thing, but I think it could work. Essentially, you would write a script that uses the AWS CLI (and whatever else is needed) to check whether the instance is created and ready. If you combine this with depends_on, or chain the output of the external data source into the next instance (use the output to set a tag?), I think it would have the effect you want.
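A rough sketch of the chaining idea; the wait_for_instance.sh script is hypothetical and would need to poll the AWS CLI and emit a JSON object on stdout, as the external data source requires:

```hcl
data "external" "first_ready" {
  # Hypothetical script: polls `aws ec2 describe-instance-status` until
  # the instance passes its checks, then prints e.g. {"status": "ok"}.
  program = ["bash", "${path.module}/wait_for_instance.sh", aws_instance.first.id]
}

resource "aws_instance" "second" {
  ami           = var.ami_id # hypothetical variable
  instance_type = "t3.micro"

  tags = {
    # Referencing the data source's result forces Terraform to run the
    # readiness check before creating this instance.
    FirstInstanceStatus = data.external.first_ready.result["status"]
  }
}
```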

This design smells a bit to me though. There are other AWS services and features that can do this kind of thing for you e.g. ECS rolling deployments with load balancer health checks.

Resources:

https://www.terraform.io/docs/providers/external/data_source.html

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-create-loadbalancer-rolling.html

Edit:

If you are stuck with EC2 instances, another native AWS way to solve this might be to use lifecycle hooks. I have used EC2 user data scripts in combination with lifecycle hooks calling Lambda functions to do rolling deployments and configuration of a custom Kafka cluster (before MSK). That required bringing up instances in order and assigning each instance a unique broker id, which sounds similar to your scenario.

Resource: https://docs.aws.amazon.com/autoscaling/ec2/userguide/lifecycle-hooks.html
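For reference, a launch lifecycle hook on an ASG can be declared like this (the names are placeholders; the Lambda or script that completes the hook is not shown):

```hcl
resource "aws_autoscaling_lifecycle_hook" "launch" {
  name                   = "wait-for-configuration"
  autoscaling_group_name = aws_autoscaling_group.app.name # hypothetical ASG
  lifecycle_transition   = "autoscaling:EC2_INSTANCE_LAUNCHING"

  # Keep the instance in Pending:Wait until something (e.g. a Lambda or
  # the instance's own user data) calls complete-lifecycle-action.
  heartbeat_timeout = 600
  default_result    = "ABANDON" # fail the launch if nothing completes the hook
}
```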