
I'm using Terraform v0.11.7 with AWS provider 1.30 to build an environment for running load tests with Locust on a Debian 9.5 AMI.

My module exposes a num_instances variable that determines the locust command line used. Below is my configuration.

resource "aws_instance" "locust_master" {
  count                   = 1

  ami                     = "${var.instance_ami}"
  instance_type           = "${var.instance_type}"
  key_name                = "${var.instance_ssh_key}"
  subnet_id               = "${var.subnet}"
  tags                    = "${local.tags}"
  vpc_security_group_ids  = ["${local.vpc_security_group_ids}"]

  user_data = <<-EOF
              #!/bin/bash
              # Install pip on instance.
              curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
              sudo python3 get-pip.py
              rm get-pip.py
              # Install locust and pyzmq on instance.
              sudo pip3 install locustio pyzmq
              # Write locustfile to instance.
              echo "${data.local_file.locustfile.content}" > ${local.locustfile_py}
              # Write locust start script to instance.
              echo "nohup ${var.num_instances > 1 ? local.locust_master_cmd : local.locust_base_cmd} &" > ${local.start_sh}
              # Start locust.
              sh ${local.start_sh}
              EOF
}

resource "aws_instance" "locust_slave" {
  count                   = "${var.num_instances - 1}"

  ami                     = "${var.instance_ami}"
  instance_type           = "${var.instance_type}"
  key_name                = "${var.instance_ssh_key}"
  subnet_id               = "${var.subnet}"
  tags                    = "${local.tags}"
  vpc_security_group_ids  = ["${local.vpc_security_group_ids}"]

  user_data = <<-EOF
              #!/bin/bash
              set -x
              # Install pip on instance.
              curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
              sudo python3 get-pip.py
              rm get-pip.py
              # Install locust and pyzmq on instance.
              sudo pip3 install locustio pyzmq
              # Write locustfile to instance.
              echo "${data.local_file.locustfile.content}" > ${local.locustfile_py}
              # Write locust master dns name to instance.
              echo ${aws_instance.locust_master.private_dns} > ${local.locust_master_host_file}
              # Write locust start script to instance.
              echo "nohup ${local.locust_slave_cmd} &" > ${local.start_sh}
              # Start locust.
              sh ${local.start_sh}
              EOF
}

If I SSH into the locust_master instance after it has launched, I see the /home/admin/start.sh script, but it does not appear to have run: there is no nohup.out file, and locust is not among the running processes. If I manually run sh /home/admin/start.sh on that host, the service starts, and I can disconnect from the host and still access it. The locust_slave host(s) exhibit the same problem.

What might cause running the start.sh in aws_instance user_data to fail? Are there any gotchas I should be aware of when executing scripts in user_data?

Many thanks in advance!

Does /var/log/cloud-init-output.log have anything useful in it? – ydaetskcoR

1 Answer


Thanks for the tip! I was not aware of that log file, and it pointed out the problem: a relative path issue. I had assumed that user_data commands would be executed with /home/admin as the working directory, so locust couldn't find the locustfile.py file. Using an absolute path to locustfile.py solved the problem.
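For anyone hitting the same thing: cloud-init runs user_data as root, and the working directory is generally not the login user's home directory. A minimal sketch of the gotcha (the file names here are illustrative, not from my actual setup):

```shell
#!/bin/bash
# Demonstrate why a relative path in a user_data script can fail:
# the file exists, but the script's working directory isn't where you assume.
touch /tmp/demo-locustfile.py

cd /   # cloud-init's cwd is typically not the user's home directory

# Relative path: resolves against the current directory, so it misses the file.
[ -f demo-locustfile.py ] && echo "relative: found" || echo "relative: missing"

# Absolute path: resolves the same way regardless of the current directory.
[ -f /tmp/demo-locustfile.py ] && echo "absolute: found" || echo "absolute: missing"
```

The same reasoning applies inside start.sh: either use absolute paths everywhere, or explicitly cd to the intended directory at the top of the script.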