AWS AutoScaling Instance Cloning git

Question

I am setting up an auto scaling group in AWS with a custom AMI. I know that AWS provides a "user-data" section where you can supply scripts to be executed at instance creation, but I am more comfortable having a startup script baked into the image itself.

I will create the custom image (web server plus all additional packages needed) and then need a startup script that clones my git repository (over ssh) into "/var/www/", for example.

My question: is there any downside to cloning a repository directly into a web server folder? The idea is that when the load on the balancer gets too high, a new instance is created from the AMI, and at startup the instance grabs the source code from the private git repository. Any advice on the best way to carry this out? I would appreciate some guidance!

As for deployment of new code to already running instances, I will be using Capistrano for that.

Thanks in advance!

Marcus Walser · Accepted Answer · 2015-03-07T00:12:17

There are a couple of downsides, yes:

If the repo is particularly large, this will take a while - you're pulling down a lot of data you don't need on each autoscaling bootstrap. (You can use git clone --depth=1 to reduce the amount of data you download.)
You'll want to run your own separate remote for this, since relying on a third party to be able to deploy code is not great - you don't want a dependency on GitHub in your deployment flow.
Your deployment artifacts won't be immutable, since tags can be moved and commits can be rebased out of existence.
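
To illustrate the shallow-clone point, here is a minimal Python sketch of what a baked-in startup script could run. The repository URL, destination path, and tag name are placeholders I made up, not values from the question. `--depth` and `--branch` are standard `git clone` options, and `--branch` also accepts a tag name.

```python
import subprocess


def shallow_clone_cmd(repo_url, dest, ref="master"):
    """Build a git command that fetches only the tip commit of one ref."""
    # --depth=1 truncates history to a single commit; by default it also
    # limits the clone to the one branch or tag named by --branch.
    return ["git", "clone", "--depth=1", "--branch", ref, repo_url, dest]


def shallow_clone(repo_url, dest, ref="master"):
    """Run the clone and raise CalledProcessError if git exits non-zero."""
    subprocess.run(shallow_clone_cmd(repo_url, dest, ref), check=True)


# Hypothetical usage on the instance (URL, path, and tag are assumptions):
# shallow_clone("git@git.example.com:me/app.git", "/var/www/app", "v1.2.0")
```

Pinning `ref` to a tag keeps every new instance on the same release, but it does not remove the mutability problem above, since a tag can still be moved.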

An alternative approach would be to use something like fpm to build a deployment artifact, store it in S3 as part of your build, and then use capistrano-artifact to have the servers grab the artifact from S3 on boot. Naively, you could also just tar up a particular revision and stuff that in S3. This has the added benefit of being super-quick to download. You lose a bit of flexibility rolling back, though - something like slugforge might help with that by leaving previous vehicles on the box and having explicit support for symlink-based rollback.
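
A rough Python sketch of the "tarball in S3 plus symlink-based rollback" idea follows. It is not how capistrano-artifact or slugforge work internally; it only shows the general pattern. The bucket, key, and directory layout are assumptions, and the S3 download assumes boto3 is installed and the instance has credentials (for example through an instance role).

```python
import os
import tarfile


def download_artifact(bucket, key, dest):
    """Fetch the build artifact from S3 (assumes boto3 and credentials)."""
    import boto3  # imported lazily so the rest works without boto3
    boto3.client("s3").download_file(bucket, key, dest)


def unpack_release(tarball, releases_dir, revision):
    """Unpack the artifact into its own releases/<revision> directory."""
    target = os.path.join(releases_dir, revision)
    os.makedirs(target, exist_ok=True)
    with tarfile.open(tarball) as tar:
        tar.extractall(target)
    return target


def activate(release_dir, current_link):
    """Point the 'current' symlink at release_dir.

    The new link is created under a temporary name and then renamed over
    the old one, which is atomic on POSIX filesystems. This assumes
    current_link is either missing or already a symlink.
    """
    tmp = current_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(release_dir, tmp)
    os.replace(tmp, current_link)
```

The web server's document root would point at the `current` symlink (for example a hypothetical `/var/www/current`). Rolling back is then just calling `activate` again with a previous release directory that is still on the box.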

As an aside, I think you're smart not to put too much logic into the userdata - you want that as simple as possible, because changing it means building an entirely new launch configuration. Rather than having the bootstrap on the AMI, I suggest storing it in S3 and using runurl to execute it from the userdata script.
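
To make the "keep the userdata tiny" idea concrete, here is a small Python sketch of the fetch-and-execute pattern. This is not runurl itself (runurl is a separate shell tool); it is a stand-in that does the same kind of thing. It assumes the bootstrap is a POSIX shell script and that the URL is readable from the instance, for example a presigned S3 URL. The URL in the usage comment is hypothetical.

```python
import os
import subprocess
import tempfile
import urllib.request


def fetch_and_run(url, args=()):
    """Download a shell script from url, run it with /bin/sh, and return
    its exit code. The temporary copy is removed afterwards."""
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
    fd, path = tempfile.mkstemp(prefix="bootstrap-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)
        # Run through /bin/sh so the file does not need the execute bit.
        return subprocess.run(["/bin/sh", path, *args]).returncode
    finally:
        os.unlink(path)


# Hypothetical usage from the userdata script:
# fetch_and_run("https://my-bucket.s3.amazonaws.com/bootstrap.sh")
```

With this split, changing what happens at boot means uploading a new bootstrap script rather than building a new launch configuration.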
