We are in the process of evaluating Terraform to (partially) replace our Ansible provisioning process for a multi-tenant SaaS. So far we appreciate its convenience, performance and reliability: we can add and remove infrastructure smoothly while keeping track of infrastructure state (that's very cool).

Our application is a multi-tenant SaaS for which we provision separate instances for each customer. In Ansible we have our own dynamic inventory (much like the EC2 dynamic inventory). We have gone through lots of Terraform books, tutorials and best practices, and many suggest that the state for multiple environments should be managed separately and remotely, but all of them assume static environments (like Dev/Staging/Prod).

Is there any best practice or real-world example of managing a dynamic inventory of states for multi-tenant apps? We would like to track the state of each customer's set of instances and roll out changes to them easily.

One approach might be to create a directory for each customer and place *.tf files inside, which call our module hosted in some shared location. State files could be stored in S3, so that we can roll out changes to each individual customer as needed.

2 Answers

10
votes

Terraform works on a folder level, pulling in all .tf files (and by default a terraform.tfvars file).

We do something similar to Anton's answer but do away with some of the complexity around templating things with sed. As a basic example, your structure might look like this:

$ tree -a --dirsfirst
.
├── components
│   ├── application.tf
│   ├── common.tf
│   ├── global_component1.tf
│   └── global_component2.tf
├── modules
│   ├── module1
│   ├── module2
│   └── module3
├── production
│   ├── customer1
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   ├── customer2
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   └── global
│       ├── common.tf -> ../../components/common.tf
│       ├── global_component1.tf -> ../../components/global_component1.tf
│       └── terraform.tfvars
├── staging
│   ├── customer1
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   ├── customer2
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   └── global
│       ├── common.tf -> ../../components/common.tf
│       ├── global_component1.tf -> ../../components/global_component1.tf
│       ├── global_component2.tf -> ../../components/global_component2.tf
│       └── terraform.tfvars
├── apply.sh
├── destroy.sh
├── plan.sh
└── remote.sh

Here you run your plan/apply/destroy from the root level. The wrapper shell scripts handle things like cd'ing into the directory and running terraform get -update=true, and they also run terraform init for the folder so that you get a unique state file key in S3, allowing you to track state for each folder independently.
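As a rough illustration of such a wrapper, here is a minimal sketch of what plan.sh might look like. It is not the actual script from this setup: the bucket name, the STATE_BUCKET and TERRAFORM variables and the function names are all made up for the example, and it assumes common.tf declares an empty backend "s3" block so that the bucket and key can be supplied at init time via -backend-config.

```shell
#!/usr/bin/env bash
# plan.sh (sketch): plan a single location folder from the repo root,
# e.g. ./plan.sh production/customer1
set -euo pipefail

# Hypothetical bucket name; replace it with your own.
STATE_BUCKET="${STATE_BUCKET:-example-terraform-state}"
# Allows swapping the binary, e.g. TERRAFORM=echo for a dry run.
TERRAFORM="${TERRAFORM:-terraform}"

# Derive a unique S3 state key from the folder path (trailing slash removed).
state_key() {
  printf '%s/terraform.tfstate\n' "${1%/}"
}

plan_location() {
  local location="${1%/}"
  local key
  key="$(state_key "$location")"
  (
    # Subshell: the cd does not leak back to the caller.
    cd "$location"
    # Assumes an empty `backend "s3" {}` block in the folder's .tf files.
    "$TERRAFORM" init -input=false \
      -backend-config="bucket=${STATE_BUCKET}" \
      -backend-config="key=${key}"
    "$TERRAFORM" get -update=true
    "$TERRAFORM" plan
  )
}

# Only run when a location was given on the command line.
if [ "$#" -ge 1 ]; then
  plan_location "$1"
fi
```

Because the key is derived from the path, production/customer1 and staging/customer1 end up with separate state objects in the same bucket. apply.sh and destroy.sh could follow the same shape with a different final command.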

The above solution has two layers: generic modules that wrap resources to provide a common interface (for example, our EC2 instances are tagged in a specific way depending on some input variables and are also given a private Route53 record), and "implemented components".

These components contain a bunch of modules/resources that Terraform applies together from the same folder. So we might put an ELB, some application servers and a database under application.tf, and symlinking that file into a location gives us a single place to control them with Terraform. Where resources differ between locations, they are separated off into their own files. In the above example you can see that staging/global has a global_component2.tf that isn't present in production. This might be something that is only applied in the non-production environments, such as some network control to prevent internet access to the environment.
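Onboarding a new customer in this layout mostly comes down to creating a folder and linking in the components it needs. The sketch below shows one way that could be scripted. The add_location function and its arguments are hypothetical and not part of the setup described here; the relative link targets mirror the tree above.

```shell
#!/usr/bin/env bash
# Sketch: create <root>/<env>/<name> and symlink the chosen components into it.
set -euo pipefail

# add_location ROOT ENV NAME COMPONENT...
add_location() {
  local root="$1" env="$2" name="$3"
  shift 3
  local dir="${root}/${env}/${name}"
  mkdir -p "$dir"

  local component
  for component in "$@"; do
    # Relative target, so the links survive moving or cloning the repo.
    ln -sfn "../../components/${component}" "${dir}/${component}"
  done

  # Start with an empty variables file unless one already exists.
  [ -e "${dir}/terraform.tfvars" ] || : > "${dir}/terraform.tfvars"
}

# Example: ./add_location.sh production customer3 application.tf common.tf
if [ "$#" -ge 3 ]; then
  add_location . "$@"
fi
```

The only file that then needs hand editing is the new terraform.tfvars, which is where the per-customer differences live.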

The real benefit here is that everything is easily viewable in source control for developers directly rather than having a templating step that produces the Terraform code you want.

It also helps you stay DRY: the only real differences between the environments are in the terraform.tfvars files in each location, which makes it easier to test changes before putting them live, as each folder is pretty much the same as the others.

2
votes

Your suggested approach sounds right to me, but there are a few more things you may want to consider.

Keep the original Terraform templates (_template in the tree below) as a versioned artifact (a git repo, for example) and just pass key-value properties to be able to recreate your infrastructure. This way you will have only a very small amount of copy-pasted Terraform configuration code lying around in directories.

This is how it looks:

/tf-infra
├── _global
│   └── global
│       ├── README.md
│       ├── main.tf
│       ├── outputs.tf
│       ├── terraform.tfvars
│       └── variables.tf
└── staging
    └── eu-west-1
        ├── saas
        │   ├── _template
        │   │   └── dynamic.tf.tpl
        │   ├── customer1
        │   │   ├── auto-generated.tf
        │   │   └── terraform.tfvars
        │   ├── customer2
        │   │   ├── auto-generated.tf
        │   │   └── terraform.tfvars
...

Two helper scripts are needed:

  1. Template rendering. Use either sed to generate the module's source attribute or a more powerful tool (as is done, for example, in airbnb/streamalert).

  2. Wrapper script. Running terraform with -var-file=... is usually enough.
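For the first helper, a minimal sed-based sketch could look like the following. The @@MODULE_SOURCE@@ placeholder and the render_template function are assumptions made for this illustration; the actual contents of dynamic.tf.tpl are not shown here, so adapt the token to whatever your template uses.

```shell
#!/usr/bin/env bash
# Sketch: render a customer's auto-generated.tf from a shared template.
set -euo pipefail

# render_template TEMPLATE OUTPUT MODULE_SOURCE
# Replaces every @@MODULE_SOURCE@@ token (hypothetical) in TEMPLATE.
render_template() {
  local template="$1" output="$2" module_source="$3"

  # Escape characters that are special in a sed replacement:
  # backslash, ampersand and the | delimiter used below.
  local escaped
  escaped="$(printf '%s' "$module_source" | sed -e 's/[\\&|]/\\&/g')"

  sed -e "s|@@MODULE_SOURCE@@|${escaped}|g" "$template" > "$output"
}

# Example:
#   ./render.sh _template/dynamic.tf.tpl customer1/auto-generated.tf <module source>
if [ "$#" -eq 3 ]; then
  render_template "$1" "$2" "$3"
fi
```

Using | as the sed delimiter avoids having to escape the slashes that module sources such as git URLs or relative paths usually contain.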

Shared Terraform state files, as well as resources that should be global (the _global directory above), can be stored in S3 so that other layers can access them.

PS: I am very much open to comments on the proposed solution, because this is an interesting task to work on :)