Terraform: state management for multi-tenancy

Question

As we're in progress of evaluating Terraform to replace (partially) our Ansible provisioning process for a multi-tenancy SaaS, we realize the convenience, performance and reliability of Terraform as we can handle the infrastructure change (adding/removing) smoothly, keeping track of infra state (that's very cool).

Our application is a multi-tenancy SaaS which we provision separate instances for our customers - in Ansible we have our own dynamic inventory (quite the same as EC2 dynamic inventory). We go through lots of Terraform books/tutorials and best practices where many suggest that multi environment states should be managed separately & remotely in Terraform, but all of them look like static env (like Dev/Staging/Prod).

Is there any best practice or real example of managing dynamic inventory of states for multi-tenancy apps? We would like to track state of each customer set of instances - populate changes to them easily.

One approach might be we create a directory for each customer and place *.tf scripts inside, which will call to our module hosted somewhere global. State files might be put to S3, this way we can populate changes to each individual customer if needed.

ydaetskcoR ydaetskcoR · Accepted Answer · 2017-04-04T13:46:26

Terraform works on a folder level, pulling in all .tf files (and by default a terraform.tfvars file).

So we do something similar to Anton's answer but do away with some complexity around templating things with sed. So as a basic example your structure might look like this:

$ tree -a --dirsfirst
.
├── components
│   ├── application.tf
│   ├── common.tf
│   ├── global_component1.tf
│   └── global_component2.tf
├── modules
│   ├── module1
│   ├── module2
│   └── module3
├── production
│   ├── customer1
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   ├── customer2
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   └── global
│       ├── common.tf -> ../../components/common.tf
│       ├── global_component1.tf -> ../../components/global_component1.tf
│       ├── global_component2.tf -> ../../components/global_component2.tf
│       └── terraform.tfvars
├── staging
│   ├── customer1
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   ├── customer2
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   └── global
│       ├── common.tf -> ../../components/common.tf
│       ├── global_component1.tf -> ../../components/global_component1.tf
│       └── terraform.tfvars
├── apply.sh
├── destroy.sh
├── plan.sh
└── remote.sh

Here you run your plan/apply/destroy from the root level where the wrapper shell scripts handle things like cd'ing into the directory and running terraform get -update=true but also running terraform init for the folder so you get a unique state file key for S3, allowing you to track state for each folder independently.

The above solution has generic modules that wrap resources to provide a common interface to things (for example our EC2 instances are tagged in a specific way depending on some input variables and also given a private Route53 record) and then "implemented components".

These components contain a bunch of modules/resources that would be applied by Terraform at the same folder. So we might put an ELB, some application servers and a database under application.tf and then symlinking that into a location gives us a single place to control with Terraform. Where we might have some differences in resources for a location then they would be separated off. In the above example you can see that staging/global has a global_component2.tf that isn't present in production. This might be something that is only applied in the non production environments such as some network control to prevent internet access to the environment.

The real benefit here is that everything is easily viewable in source control for developers directly rather than having a templating step that produces the Terraform code you want.

It also helps follow DRY where the only real differences between the environments are in the terraform.tfvars files in the locations and makes it easier to test changes before putting them live as each folder is pretty much the same as the other.

Terraform: state management for multi-tenancy

2 Answers