Good day,

I'm provisioning VMs from a template on vSphere with Terraform. Once the VMs come up, a file provisioner copies local static content (pictures). Once that's done, a remote-exec provisioner does this:

provisioner "remote-exec" {
     inline = [
       "mkdir /home/foo/static",
       "mv /home/foo/logo.jpg /home/foo/static/",
       "echo 'python /home/foo/app.py 2>&1 &' > /home/foo/start_app.sh",
       "chmod u+x /home/foo/start_app.sh",
       "/home/foo/start_app.sh 2>&1",
       "sleep 60"
     ]
}

app.py is a Python Flask project. The code starts and serves content fine, but only for roughly 60 seconds. When my sleep timer expires, I suspect the shell spawned by Terraform dies and takes app.py with it. I tried launching start_app.sh in the background, and I tried running it with sudo (foreground and background), to no avail: same behavior. If I start python /home/foo/app.py directly inside the remote-exec block rather than invoking start_app.sh, then TF never exits the shell and my Jenkins build keeps spinning forever.

I don't think it makes any difference, but to be complete my TF VMware plan is applied when Git sends a webhook to Jenkins. Jenkins invokes the TF plan as part of a pipeline stage. Here's the console output from Jenkins:

vsphere_virtual_machine.vm[1]: Provisioning with 'file'...
vsphere_virtual_machine.vm[1]: Provisioning with 'remote-exec'...
vsphere_virtual_machine.vm[1] (remote-exec): Connecting to remote host via SSH...
vsphere_virtual_machine.vm[1] (remote-exec):   Host: 1.1.1.200
vsphere_virtual_machine.vm[1] (remote-exec):   User: foo
vsphere_virtual_machine.vm[1] (remote-exec):   Password: true
vsphere_virtual_machine.vm[1] (remote-exec):   Private key: false
vsphere_virtual_machine.vm[1] (remote-exec):   SSH Agent: false
vsphere_virtual_machine.vm[1] (remote-exec):   Checking Host Key: false
vsphere_virtual_machine.vm[1] (remote-exec): Connected!
vsphere_virtual_machine.vm[1] (remote-exec):  * Serving Flask app "app" (lazy loading)
vsphere_virtual_machine.vm[1] (remote-exec):  * Environment: production
vsphere_virtual_machine.vm[1] (remote-exec):    WARNING: This is a development server. Do not use it in a production deployment.
vsphere_virtual_machine.vm[1] (remote-exec):    Use a production WSGI server inst
vsphere_virtual_machine.vm[1] (remote-exec):  * Debug mode: off
vsphere_virtual_machine.vm[1] (remote-exec):  * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
vsphere_virtual_machine.vm.1: Still creating... (8m30s elapsed)
vsphere_virtual_machine.vm.0: Still creating... (8m30s elapsed)
vsphere_virtual_machine.vm.1: Still creating... (8m40s elapsed)
vsphere_virtual_machine.vm.0: Still creating... (8m40s elapsed)
vsphere_virtual_machine.vm.1: Still creating... (8m50s elapsed)
vsphere_virtual_machine.vm.0: Still creating... (8m50s elapsed)
vsphere_virtual_machine.vm.1: Still creating... (9m0s elapsed)
vsphere_virtual_machine.vm.0: Still creating... (9m0s elapsed)
vsphere_virtual_machine.vm.1: Still creating... (9m10s elapsed)
vsphere_virtual_machine.vm.0: Still creating... (9m10s elapsed)
vsphere_virtual_machine.vm[0]: Creation complete after 9m15s (ID: 4228d941-a19c-361d-073a-4441cde5973e)

How can I keep the shell spawned by TF persistent after TF's remote-exec is done?

Don't do this. You should be running your application as a service using your operating system's service mechanism. In most modern Linux distros this will be a systemd unit file. You should also consider baking these remote-exec steps into the image using something like Packer so you can drop the remote-exec altogether. - ydaetskcoR
I am/was trying to keep my VM templates as clean as possible and run code at instantiation time. That way, I don't have a template for a web server and another for a database server. With EC2/Azure VMs I'd use Cloud Init rather than AMIs/galleries generated by Packer. - Christopher Paggen
Even if you want to do that you're still going to want to be running the application as a service. This could be configured entirely via the remote-exec shell script but I'd recommend running something like Ansible/Chef etc if you really want to configure on the fly. But there really shouldn't be a major issue with just creating more templates/images to have hyper-specialised images. I personally build a base AMI (I run in AWS) and then create AMIs on top of that for anything special (I then also run applications in Docker via ECS but that's not necessary if you don't need it). - ydaetskcoR
@ydaetskcoR - thanks, I think that makes sense. I found one heck of an ugly hack but this is just for a lab demo, I'll answer my own question below. - Christopher Paggen
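The service approach suggested in these comments could be sketched roughly as follows, still driven from Terraform. This is a minimal, untested sketch: the unit name flaskapp.service, the interpreter path /usr/bin/python, and passwordless sudo for user foo are all assumptions that do not come from the question.

```hcl
# Hypothetical unit name and paths; adjust them to the actual VM.
provisioner "file" {
  # Write a systemd unit describing the Flask app to a temporary location.
  content = <<EOT
[Unit]
Description=Flask demo app
After=network.target

[Service]
User=foo
WorkingDirectory=/home/foo
ExecStart=/usr/bin/python /home/foo/app.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOT

  destination = "/tmp/flaskapp.service"
}

provisioner "remote-exec" {
  inline = [
    # Assumes user foo may run sudo without a password prompt.
    "sudo mv /tmp/flaskapp.service /etc/systemd/system/flaskapp.service",
    "sudo systemctl daemon-reload",
    # Start the app now and on every boot; systemd owns the process, not the SSH session.
    "sudo systemctl enable --now flaskapp.service",
  ]
}
```

Because systemd is the parent of the process, the app should no longer depend on the lifetime of the shell that Terraform opens.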

1 Answer

OK - I found a really terrible hack. Terraform doesn't run remote-exec anymore; it just pushes my Python Flask app and its associated static content using a file provisioner. Inside Jenkins, I have a stage that looks like this:

def remote = [:]
remote.name = "1.1.1.200"
remote.host = "1.1.1.200"
remote.allowAnyHosts = true

node {
    withCredentials([usernamePassword(credentialsId: 'sshUserAccount', passwordVariable: 'password', usernameVariable: 'userName')]) {
        remote.user = userName
        remote.password = password
        stage("SSH Steps Rocks!") {
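            try {
                // Stop waiting on the SSH step after one minute
                timeout(time: 1, unit: 'MINUTES') {
                    sshCommand remote: remote, command: '/home/foo/start_app.sh &'
                }
            } catch (err) {
                // The timeout is expected here, so mark the build as successful anyway
                currentBuild.result = 'SUCCESS'
            }
        }
    }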
}

Once the Jenkins SSH thread exits after one minute, my Python app keeps running on the remote machine. Yay! It's just for a lab demo, not something I'd consider doing in prod. There I'd use custom templates as suggested.
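
For comparison, a commonly suggested way to keep a background process alive from remote-exec itself is to detach it fully from the SSH session with nohup and redirect all three standard streams. The following is a minimal sketch under that assumption, not something verified against this particular vSphere setup; the log path /home/foo/app.log is a hypothetical name.

```hcl
provisioner "remote-exec" {
  inline = [
    "mkdir -p /home/foo/static",
    "mv /home/foo/logo.jpg /home/foo/static/",
    # nohup ignores the hangup signal; redirecting stdin, stdout and stderr
    # means the app holds no handle on the SSH session that is about to close.
    "nohup python /home/foo/app.py > /home/foo/app.log 2>&1 < /dev/null &",
    # Short pause so the process has started before the session ends.
    "sleep 1",
  ]
}
```

The likely reason the original start_app.sh approach fails is that the app's output stays attached to the provisioner's session (the Flask banner shows up in the Terraform log), so the process goes away when that session does. Redirecting the output to a file is what breaks that link.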