13
votes

We use Terraform heavily for AWS cloud provisioning. Our base Terraform structure looks like this:

├── modules
│   ├── x
│   └── y
└── environments
    ├── dev
    │   ├── main.tf
    │   ├── output.tf
    │   └── variables.tf
    ├── uat
    │   ├── main.tf
    │   ├── output.tf
    │   └── variables.tf
    └── prod
        ├── main.tf
        ├── output.tf
        └── variables.tf

We have reached a point where we have many modules and many environments, and code duplication has become a serious headache. We would like to get rid of as much of it as possible.

Our main concern at the moment is the output.tf files: every time we extend an existing module or add a new one, we need to set up the environment-specific configuration for it (this is expected), but we also have to copy/paste the required parts into output.tf to expose the results of the provisioning (IP addresses, AWS ARNs, etc.).
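
For example (the instance_ip output name below is purely illustrative), each environment ends up repeating the same plumbing in its output.tf:

# environments/dev/main.tf
module "x" {
  source = "../../modules/x"
  # ... environment-specific inputs ...
}

# environments/dev/output.tf - the same block gets copy/pasted into uat and prod
output "x_instance_ip" {
  value = "${module.x.instance_ip}"
}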

Is there a way to get rid of the duplicated output.tf files? Could we just define the wanted outputs in the modules themselves and see all defined outputs whenever we run terraform for a specific environment?


3 Answers

5
votes

We built and open sourced Terragrunt to solve this very issue. One of Terragrunt's features is the ability to download remote Terraform configurations. The idea is that you define the Terraform code for your infrastructure just once, in a single repo, called, for example, modules:

└── modules
    ├── app
    │   └── main.tf
    ├── mysql
    │   └── main.tf
    └── vpc
        └── main.tf

This repo contains typical Terraform code, with one difference: anything in your code that should be different between environments should be exposed as an input variable. For example, the app module might expose the following variables:

variable "instance_count" {
  description = "How many servers to run"
}

variable "instance_type" {
  description = "What kind of servers to run (e.g. t2.large)"
}
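
For illustration, a minimal sketch of what modules/app/main.tf might contain (the aws_instance resource, the AMI value and the output below are assumptions, not part of the actual module): the environment-specific values arrive through the input variables, and anything you want reported after provisioning is declared as an output inside the module, so it lives in one place:

resource "aws_instance" "app" {
  count         = "${var.instance_count}"
  ami           = "ami-12345678"           # placeholder AMI ID
  instance_type = "${var.instance_type}"
}

# Declared once, inside the module
output "instance_ips" {
  value = "${aws_instance.app.*.private_ip}"
}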

In a separate repo, called, for example, live, you define the code for all of your environments, which now consists of just one .tfvars file per component (e.g. app/terraform.tfvars, mysql/terraform.tfvars, etc.). This gives you the following file layout:

└── live
    ├── prod
    │   ├── app
    │   │   └── terraform.tfvars
    │   ├── mysql
    │   │   └── terraform.tfvars
    │   └── vpc
    │       └── terraform.tfvars
    ├── qa
    │   ├── app
    │   │   └── terraform.tfvars
    │   ├── mysql
    │   │   └── terraform.tfvars
    │   └── vpc
    │       └── terraform.tfvars
    └── stage
        ├── app
        │   └── terraform.tfvars
        ├── mysql
        │   └── terraform.tfvars
        └── vpc
            └── terraform.tfvars

Notice how there are no Terraform configurations (.tf files) in any of the folders. Instead, each .tfvars file contains a terraform { ... } block that specifies where to download the Terraform code from, as well as the environment-specific values for the input variables in that code. For example, stage/app/terraform.tfvars may look like this:

terragrunt = {
  terraform {
    source = "git::[email protected]:foo/modules.git//app?ref=v0.0.3"
  }
}

instance_count = 3
instance_type = "t2.micro"

And prod/app/terraform.tfvars may look like this:

terragrunt = {
  terraform {
    source = "git::[email protected]:foo/modules.git//app?ref=v0.0.1"
  }
}

instance_count = 10
instance_type = "m2.large"

See the Terragrunt documentation for more info.

3
votes

One way to resolve this is to create a base environment, and then symlink the common elements, for example:

├── modules
│   ├── x
│   └── y
└── environments
    ├── base
    │   ├── output.tf
    │   └── variables.tf
    ├── dev
    │   ├── main.tf
    │   ├── output.tf -> ../base/output.tf
    │   └── variables.tf -> ../base/variables.tf
    ├── uat
    │   ├── main.tf
    │   ├── output.tf -> ../base/output.tf
    │   └── variables.tf -> ../base/variables.tf
    ├── super_custom
    │   ├── main.tf
    │   ├── output.tf        # not symlinked
    │   └── variables.tf     # not symlinked
    └── prod
        ├── main.tf
        ├── output.tf -> ../base/output.tf
        └── variables.tf -> ../base/variables.tf
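
Setting this up is just a matter of creating relative symlinks from each environment directory to the shared files in base, for example:

cd environments/dev
ln -s ../base/output.tf output.tf
ln -s ../base/variables.tf variables.tf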

This approach only really works if your output.tf and variables.tf files are the same for each environment, and although you can have non-symlinked variants (e.g. super_custom above), this can become confusing as it's not immediately obvious which environments are custom and which aren't. YMMV. I try to keep the changes between environments limited to a .tfvars file per environment.

It's worth reading Charity Majors' excellent post on tfstate files, which set me on this path.

0
votes

If your dev, uat and prod environments have the same shape but different properties, you could use workspaces to keep each environment's state separate, together with a *.tfvars file per environment to specify the different configurations.

This could look like:

├── modules
│   ├── x
│   └── y
├── dev.tfvars
├── prod.tfvars
├── uat.tfvars
├── main.tf
├── outputs.tf
└── variables.tf

You can create a new workspace with:

terraform workspace new uat

Then deploying changes becomes:

terraform workspace select uat
terraform apply --var-file=uat.tfvars
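
After an apply, Terraform prints the values declared in outputs.tf, and you can read them back for the currently selected workspace at any time with:

terraform output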

The workspaces feature ensures that the different environments' states are managed separately, which is a bonus.

This approach only works when the differences between the environments are small enough that it makes sense to encapsulate the logic for them in the individual modules (for example, a high_availability flag which adds some additional redundant infrastructure for uat and prod).
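
As a sketch of that idea (the variable name and the standby resource below are illustrative assumptions, not an existing module), the flag can drive a conditional count inside a module, with uat.tfvars and prod.tfvars setting it to true:

variable "high_availability" {
  description = "Provision redundant infrastructure when true"
  default     = false
}

# Only created when high_availability is true
resource "aws_instance" "standby" {
  count         = "${var.high_availability ? 1 : 0}"
  ami           = "ami-12345678"   # placeholder AMI ID
  instance_type = "t2.micro"
}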