
We are trying to use Terraform with remote state stored in S3.

The projects are split up so that there is a “main” VPC project, which creates only the network infrastructure (VPC, subnets, IGW, NAT, routes, etc.), and sub-projects that create specific resources (e.g. EC2 nodes) on top of the main VPC's subnets.

Project folders/files:

.
├── modules/
│   └── mod-vpc/
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── projects/
│   └── top-level-project-name-goes-here/
│       ├── env-dev/
│       │   ├── globals.tf
│       │   ├── test/
│       │   │   ├── main.tf
│       │   │   └── variables.tf
│       │   └── vpc/
│       │       ├── main.tf
│       │       └── variables.tf
│       └── env-prod/
└── terraform.tfvars
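
Roughly, the vpc sub-project's main.tf just wraps the shared mod-vpc module. A simplified sketch (the module's actual variable names are omitted above, so the ones below are only placeholders):

# projects/top-level-project-name-goes-here/env-dev/vpc/main.tf (simplified sketch)
module "vpc_main" {
  source   = "../../../../modules/mod-vpc"   # relative path up from env-dev/vpc to modules/
  vpc_name = "main_vpc"                      # placeholder variable name
  cidr     = "10.198.0.0/16"                 # placeholder variable name
}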

All projects other than the VPC project use the vpc_id, CIDR, etc. from the VPC's remote state. Here is how our process is defined:

Step 1: Create VPC.

No issues here: the VPC gets created, the outputs are printed out, and the state is stored in the S3 bucket:

$ terraform init -backend=s3 -backend-config="region=us-west-2" -backend-config="bucket=xxx" -backend-config="key=xxx" -backend-config="acl=bucket-owner-full-control" $project_path
$ terraform remote pull
$ terraform get $project_path
$ terraform apply

Outputs:

cidr_block = 10.198.0.0/16
private_subnet_ids = subnet-d3f5029a,subnet-fbeb369c,subnet-7ad88622
public_subnet_ids = subnet-54f5021d
region = us-west-2
vpc_id = vpc-b31ca3d4
vpc_name = main_vpc
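
These outputs are exposed at the top level of the VPC project by forwarding the module outputs, along the lines of this simplified sketch (assuming the module instance is called vpc_main):

# top level of the vpc project (simplified sketch)
output "vpc_id" {
  value = "${module.vpc_main.vpc_id}"
}

output "public_subnet_ids" {
  value = "${module.vpc_main.public_subnet_ids}"
}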

Step 2: Create other resource groups. Using the output values from the VPC remote state, we try to deploy EC2 nodes into the already provisioned public subnet(s) (output of the VPC project from Step 1 above). Here are the steps/commands our script runs (first we copy all files to a /tmp/project/ working folder, and the script is executed in that folder):

$ terraform init -backend=s3 -backend-config="region=us-west-2" -backend-config="bucket=xxx" -backend-config="key=xxx" -backend-config="acl=bucket-owner-full-control" $project_path
$ terraform remote pull
$ terraform get $project_path
$ terraform apply

Here is what the project file structure looks like in the /tmp/project/ folder:

├── .terraform
│   ├── modules
│   │   ├── 7d29d4ce6c4f98d8bcaa8b3c0ca4f8f1 -> /pathto/modules/mod-cassandra
│   │   └── aa8ffe05b5d08913f821fdb23ccdfd95
│   └── terraform.tfstate
├── globals.tf
├── main.tf
├── terraform.tfvars
└── variables.tf

Here is what the main.tf file looks like for this project:

resource "aws_instance" "test" {
  instance_type = "${var.instance_type}"
  ami = "${var.ami}"
  subnet_id = "${data.terraform_remote_state.vpc_main.public_subnet_ids}" 
  vpc_security_group_ids = ["${aws_security_group.http_ext.id}"]    
}

Here is the definition of the data.terraform_remote_state data source referenced above:

data "terraform_remote_state" "vpc_main" {
  backend = "s3"
  config {
    region = "us-west-2"
    bucket = "xxx"
    key    = "xxx/vpc.json"
  }
}
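
The individual outputs are then available as attributes on this data source. Judging by the Step 1 output, the subnet ID lists come back as comma-joined strings, so picking out a single subnet would look something like this sketch:

subnet_id = "${element(split(",", data.terraform_remote_state.vpc_main.public_subnet_ids), 0)}"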

Depending on where (in which file) we declare the data.terraform_remote_state.vpc_main data source, we get different results:

Option 1. If the data.terraform_remote_state block is declared in the same file (main.tf) within the “test” project, everything executes successfully.

Option 2. If we move data.terraform_remote_state.vpc_main to a separate file (“globals.tf”), we get this error during the [terraform get $project_path] step:

$ terraform init -backend=s3 -backend-config="region=us-west-2" -backend-config="bucket=xxx" -backend-config="key=xxx" -backend-config="acl=bucket-owner-full-control" $project_path
$ terraform remote pull
$ terraform get $project_path

Error loading Terraform: module root: 4 error(s) occurred:

* module 'cassandra': unknown resource 'data.terraform_remote_state.vpc_main' referenced in variable data.terraform_remote_state.vpc_main.cidr_block
* module 'cassandra': unknown resource 'data.terraform_remote_state.vpc_main' referenced in variable data.terraform_remote_state.vpc_main.region
* module 'cassandra': unknown resource 'data.terraform_remote_state.vpc_main' referenced in variable data.terraform_remote_state.vpc_main.vpc_id
* module 'cassandra': unknown resource 'data.terraform_remote_state.vpc_main' referenced in variable data.terraform_remote_state.vpc_main.public_subnet_ids

This indicates that Terraform was, for some reason, not able to resolve the data.terraform_remote_state.vpc_main data source.

Option 3. When, for testing purposes, we enable both declarations (in “globals.tf” and in “main.tf”), we get this error during the [terraform apply] step:

$ terraform init -backend=s3 -backend-config="region=us-west-2" -backend-config="bucket=xxx" -backend-config="key=xxx" -backend-config="acl=bucket-owner-full-control" $project_path
$ terraform remote pull
$ terraform get $project_path
$ terraform apply

module root: 1 error(s) occurred:
2017/01/14 14:02:50 [DEBUG] plugin: waiting for all plugin processes to complete...

•   data.terraform_remote_state.vpc_main: resource repeated multiple times

That is a valid error, as we now have the same data source defined in two places.

But why was Terraform not able to resolve this data source properly when we put it into a separate file under Option 2 above?

Per the Terraform documentation, all *.tf files are loaded and appended in alphabetical order, and declaration order does not matter, since Terraform configurations are declarative:

https://www.terraform.io/docs/configuration/load.html

That does not seem to be the case above.

We could go with a “hardcoded” approach here, but is there a “legitimate” way in Terraform to make this work?

Comments:

In “Step 1: Create VPC”, which directory are you in? Is it projects/top-level-project-name-goes-here/env-dev/vpc/? And is module "cassandra" in projects/top-level-project-name-goes-here/env-dev/test? – Alex Rudd

On any step, our wrapper shell script copies the content of the current “target” project (be it VPC, Cassandra, or another) to the /tmp/project folder, and that becomes the working directory. For the VPC it is something like: a) cp projects/.../vpc/*.* /tmp/project/ and then b) cd /tmp/project, after which we first get the Terraform remote state(s) and then execute the appropriate terraform plan|apply|destroy. So we start at the top, but the working folder is /tmp/project. – gevgev

I'd check the key for your bucket in your terraform_remote_state data source. At least for our stuff the filename is 'terraform.tfstate' and not something like "<whatever>.json". Also, just to be sure, key and bucket are different things; I'm guessing you're just putting 'xxx' in there to not give us the actual key and bucket names. – jpancoast

2 Answers

Answer 1 (1 vote)

Try using these commands to set up the remote state:

terraform_bucket_region='eu-west-1'
terraform_bucket_name='xxx'
terraform_file_name="terraform.tfstate"

export AWS_ACCESS_KEY_ID="xxx"
export AWS_SECRET_ACCESS_KEY="xxx"

# Start from a clean slate: drop any cached modules/state config and local state backup
[ -d .terraform ] && rm -rf .terraform
[ -f terraform.tfstate.backup ] && rm terraform.tfstate.backup

# Point this working directory at the remote state in S3
terraform remote config -backend=S3 -backend-config="region=${terraform_bucket_region}" -backend-config="bucket=${terraform_bucket_name}" -backend-config="key=${terraform_file_name}"
terraform get

I've set this up as a shell script called set-remote-tf.sh.

Answer 2 (1 vote)

I have been using Terraform remote state for a while. I think your problem is one of how the dependencies between your Terraform states are organized.

You should run Terraform in each folder, and have a config.tf for each one too.

.
├── modules/
│   └── mod-vpc/
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── projects/
│   └── top-level-project-name-goes-here/
│       ├── env-dev/
│       │   ├── globals.tf
│       │   ├── test/
│       │   │   ├── config.tf
│       │   │   ├── main.tf
│       │   │   ├── variables.tf
│       │   │   └── terraform.tfvars
│       │   └── vpc/
│       │       ├── config.tf
│       │       ├── main.tf
│       │       ├── variables.tf
│       │       └── terraform.tfvars
│       └── env-prod/

# ../vpc/config.tf
terraform {
  backend "s3" {
    bucket = "my-infrastructure"
    key    = "vpc/terraform.tfstate"
    region = "us-west-2"
  }
}

# ../test/config.tf
terraform {
  backend "s3" {
    bucket = "my-infrastructure"
    key    = "test/terraform.tfstate"
    region = "us-west-2"
  }
}

data "terraform_remote_state" "vpc_main" {
  backend   = "s3"
  # workspace = "${terraform.workspace}" // optional

  config {
    bucket = "my-infrastructure"
    prefix = "vpc"
  }
}

data "terraform_remote_state" "other_terraform_state" {
  backend   = "s3"
  workspace = "${terraform.workspace}"

  config {
    bucket = "my-infrastructure"
    prefix = "other_terraform_state"
  }
}
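
The test project can then pull values out of the vpc state wherever it needs them, for example (a minimal sketch based on the resources in your question):

# ../test/main.tf (sketch)
resource "aws_security_group" "http_ext" {
  name   = "http_ext"
  vpc_id = "${data.terraform_remote_state.vpc_main.vpc_id}"
}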

You can check a GCP example here: https://github.com/abgm/gcp-terraform-example/tree/first-example