2
votes

In my Terraform infrastructure, I spin up several Kubernetes clusters based on parameters, then install some standard contents into those clusters using the kubernetes provider.

When I change the parameters and one of the clusters is no longer needed, Terraform is unable to tear it down because the provider and the resources are both in the same module. I don't see an alternative, however, because I create the Kubernetes cluster in that same module, and the Kubernetes objects are all per cluster.

All the solutions I can think of involve adding a bunch of boilerplate to my Terraform config. Should I consider generating my Terraform config from a script?


I made a git repo that shows exactly the problems I'm having:

https://github.com/bukzor/terraform-gke-k8s-demo

1
What error do you get? Can you provide a minimal reproducible example that reproduces the issue? You absolutely can create and destroy things that have a dependent provider in one go with Terraform, but it does depend slightly on the provider and also on how you write things. - ydaetskcoR
@ydaetskcoR: Here you go! github.com/bukzor/terraform-gke-k8s-demo - bukzor

1 Answer

3
votes

TL;DR

Two solutions:

  1. Split the configuration into two separate Terraform modules
  2. Use interpolations and depends_on between the code that creates your Kubernetes cluster and the Kubernetes resources:

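    resource "kubernetes_service" "example" {
      metadata {
        name = "my-service"
      }

      # Explicit ordering: create this Service after the VPC, destroy it before the VPC.
      # Terraform 0.12+ expects an unquoted reference; 0.11 used "aws_vpc.kubernetes".
      depends_on = [aws_vpc.kubernetes]
    }

    resource "aws_vpc" "kubernetes" {
      ...
    }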
    

When destroying resources

You are encountering a dependency lifecycle issue.

PS: I don't know the code you used to create and provision your Kubernetes cluster, but I guess the workflow looks like this:

  1. Write code for the Kubernetes cluster (creates a VPC)
  2. Apply it
  3. Write code for provisioning Kubernetes (creates a Service that creates an ELB)
  4. Apply it
  5. Try to destroy everything => Error

What is happening is that when you create a LoadBalancer Service, Kubernetes provisions an ELB on AWS. Terraform doesn't know about that ELB, and nothing links it to any other resource managed by Terraform. So when Terraform destroys the resources in the code, it tries to destroy the VPC, and that fails because the VPC still contains an ELB that Terraform doesn't know about. The first step is therefore to make sure that Terraform "deprovisions" the Kubernetes cluster (removes the Kubernetes resources inside it) before it destroys the cluster itself.

Two solutions here:

  1. Use different modules so there is no dependency lifecycle. For example, the first module could be k8s-infra and the other could be k8s-resources. The first one manages the whole skeleton of Kubernetes and is applied first / destroyed last. The second one manages what is inside the cluster and is applied last / destroyed first.

  2. Use the depends_on parameter to write the dependency lifecycle explicitly
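For the first option, here is a minimal sketch of what the k8s-resources configuration might look like if it lives in its own directory with its own state. Everything named here is an assumption for illustration: the local state path, the output names (cluster_endpoint, cluster_ca_certificate, cluster_token) that k8s-infra would have to export, and the Service itself.

    # k8s-resources/main.tf (hypothetical layout)

    # Read the outputs of the separately applied k8s-infra configuration.
    # Assumes k8s-infra uses the local backend and lives in a sibling directory.
    data "terraform_remote_state" "infra" {
      backend = "local"

      config = {
        path = "../k8s-infra/terraform.tfstate"
      }
    }

    # The output names below are hypothetical; k8s-infra must define them.
    provider "kubernetes" {
      host                   = data.terraform_remote_state.infra.outputs.cluster_endpoint
      cluster_ca_certificate = base64decode(data.terraform_remote_state.infra.outputs.cluster_ca_certificate)
      token                  = data.terraform_remote_state.infra.outputs.cluster_token
    }

    # Everything inside the cluster is managed here, so it can be
    # destroyed while the cluster (and its VPC) still exists.
    resource "kubernetes_service" "example" {
      metadata {
        name = "my-service"
      }

      spec {
        type = "LoadBalancer"

        port {
          port        = 80
          target_port = 8080
        }
      }
    }

With this split, you would run terraform destroy in k8s-resources first and in k8s-infra second, which gives Kubernetes the chance to remove the ELB before Terraform touches the VPC.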

When creating resources

You might also run into a dependency issue where terraform apply cannot create resources even though nothing has been applied yet. Here is another example, this time with PostgreSQL:

  1. Write code to create an RDS PostgreSQL server
  2. Apply it with Terraform
  3. Write code, in the same module, to provision that RDS instance with the postgres terraform provider
  4. Apply it with Terraform
  5. Destroy everything
  6. Try to apply everything => ERROR

By debugging Terraform a bit, I've learned that all the providers are initialized at the beginning of the plan / apply, so if one of them has an invalid config (wrong API keys / unreachable endpoint), Terraform will fail.

The solution here is to use the -target option of the plan / apply command. Terraform will only initialize the providers that are related to the resources being applied.

  1. Apply the RDS code with the AWS provider first: terraform apply -target=aws_db_instance.<name> (-target takes a full resource address, so substitute the name of your own resource)
  2. Apply everything: terraform apply. Because the RDS instance is now reachable, the PostgreSQL provider can also initialize itself.
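To make the scenario concrete, here is a minimal sketch of the kind of single-module configuration described above, where the postgresql provider is configured from the attributes of an RDS instance created in the same module. All names (aws_db_instance.example, var.db_password, postgresql_database.app), the region, and the instance sizing are assumptions for illustration, and the postgresql provider is assumed to be the community cyrilgdn/postgresql one.

    terraform {
      required_providers {
        aws = {
          source = "hashicorp/aws"
        }
        postgresql = {
          source = "cyrilgdn/postgresql"
        }
      }
    }

    variable "db_password" {
      type      = string
      sensitive = true
    }

    provider "aws" {
      region = "us-east-1" # assumed region
    }

    # Step 1 target: terraform apply -target=aws_db_instance.example
    resource "aws_db_instance" "example" {
      engine              = "postgres"
      instance_class      = "db.t3.micro"
      allocated_storage   = 20
      username            = "postgres"
      password            = var.db_password
      skip_final_snapshot = true
    }

    # This provider can only connect once the instance above exists,
    # which is why a first apply of everything at once can fail.
    provider "postgresql" {
      host     = aws_db_instance.example.address
      port     = aws_db_instance.example.port
      username = aws_db_instance.example.username
      password = var.db_password
    }

    # Step 2: a plain terraform apply creates what lives inside the server.
    resource "postgresql_database" "app" {
      name = "app"
    }

The two-step apply mirrors the list above: the targeted run only needs the AWS provider, and the second run finds a reachable PostgreSQL endpoint.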