One important factor in deciding how to assemble servers, build AMIs, and plan infrastructure is the answer to this question: in production, how fast will I need a new instance launched?
The answer to this question will determine how much you bake into the AMI vs. how much you build after boot.
NOTE: My experience is with Chef Server so I will use Chef terminology but the concepts are the same for any other configuration management stack.
The general rule of thumb is to treat your infrastructure as code. This means thinking about launching instances, creating users on a machine, and managing known_hosts files and SSH keys the same way you think about your application code. Tracking infrastructure changes in source control makes management, redeployment, and even CI much easier.
This Chef Introduction covers the terminology in Chef of Cookbooks, Recipes, Resources, and more. It shows you how to build a simple LAMP stack, and how you can relaunch it just as easily with one command.
So given the example in your question, at a high level I would do the following:
- Launch a base Ubuntu Linux AMI (currently 14.04) with a CloudFormation template.
- In the UserData section of the Instance configuration, bootstrap the Chef Client install process.
- Run a Recipe to create a user.
- Run a Recipe to create the known_hosts file for the user.
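To make those steps concrete, here is a rough sketch that generates a CloudFormation template whose UserData installs the Chef client and runs it once. Treat it as an illustration under assumptions, not a finished setup: the resource name `AppServer`, the recipe names in `RUN_LIST`, the AMI ID, and the instance type are all placeholders I made up, and it assumes `client.rb` and the validation key get onto the instance some other way.

```python
import json

# Hypothetical recipe names; substitute the cookbooks you actually use.
RUN_LIST = ["recipe[my_users::deploy]", "recipe[my_users::known_hosts]"]


def build_user_data(run_list):
    """Return a shell script that installs the Chef client and runs it once."""
    first_boot = json.dumps({"run_list": run_list})
    lines = [
        "#!/bin/bash",
        "set -e",
        "# Install the Chef client with the Omnitruck installer.",
        "curl -L https://omnitruck.chef.io/install.sh | bash",
        "mkdir -p /etc/chef",
        "# Assumption: client.rb and the validation key are provisioned separately.",
        "cat > /etc/chef/first-boot.json <<'EOF'",
        first_boot,
        "EOF",
        "# -j passes the JSON attributes file containing the run list.",
        "chef-client -j /etc/chef/first-boot.json",
    ]
    return "\n".join(lines) + "\n"


def build_template(image_id, instance_type, run_list):
    """Return a minimal CloudFormation template as a plain dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "AppServer": {  # hypothetical logical resource name
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                    "UserData": {"Fn::Base64": build_user_data(run_list)},
                },
            }
        },
    }


if __name__ == "__main__":
    # "ami-PLACEHOLDER" stands in for the base Ubuntu AMI ID in your region.
    template = build_template("ami-PLACEHOLDER", "t2.micro", RUN_LIST)
    print(json.dumps(template, indent=2))
```

Because the template and run list live in a file like this, they can be committed and reviewed like any other code, which is the point of the approach.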
Tools like Chef are used because you are able to break down the infrastructure into small blocks of code performing specific functions. There are numerous Cookbooks already built and available that perform the basic building blocks of creating services, installing software packages, etc.
All that being said, there are times when you have to deviate from best practices in the interest of your specific domain and requirements. There may be situations where, despite all the advantages of an infrastructure management tool, you will still need to bake items into the AMI.
Let's pretend your application does image processing and has a requirement to use ImageMagick. Let's assume that you will need to build ImageMagick from source. If you were to do this via Chef Recipes this could add another 7 minutes of just compiling ImageMagick to the normal instance boot time. If waiting 10-12 minutes is too long for a new instance to come online then you may want to consider baking your own AMI that has ImageMagick already compiled and installed.
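The arithmetic behind that decision can be sketched in a few lines. The numbers here are only the ones from the example: 7 minutes of compiling, on top of a base boot of roughly 5 minutes (implied by the 12 minute upper bound). The 10 minute budget and the function names are hypothetical.

```python
def total_boot_minutes(base_minutes, extra_steps):
    """Sum the base boot time and every post-boot provisioning step."""
    return base_minutes + sum(extra_steps.values())


def should_bake(base_minutes, extra_steps, budget_minutes):
    """True when provisioning after boot would exceed the launch-time budget."""
    return total_boot_minutes(base_minutes, extra_steps) > budget_minutes


if __name__ == "__main__":
    steps = {"compile_imagemagick": 7}  # minutes, from the example above
    print(total_boot_minutes(5, steps))               # 12
    print(should_bake(5, steps, budget_minutes=10))   # True: consider baking
```

If the total fits inside your budget, building after boot keeps you on stock AMIs; if it does not, baking the slow step into the image is the trade-off to consider.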
This is an acceptable solution, but keep in mind that managing your own fleet of pre-baked AMIs adds infrastructure overhead. You will need to keep your custom AMIs updated as new base AMIs are released and as you expand to different instance types and different AWS Regions.
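One way to see that overhead: AMI IDs are region-specific, so every custom image needs an entry per region, typically in a CloudFormation `Mappings` section looked up with `Fn::FindInMap`. The sketch below is illustrative only; the map name `RegionMap`, the key `ImageMagick`, the region list, and the AMI IDs are placeholders, and `missing_regions` is a hypothetical helper.

```python
# Placeholder AMI IDs: each region needs its own copy of every custom image.
CUSTOM_AMIS = {
    "us-east-1": {"ImageMagick": "ami-PLACEHOLDER-1"},
    "eu-west-1": {"ImageMagick": "ami-PLACEHOLDER-2"},
}


def image_id_lookup(map_name, key):
    """Build the ImageId value that resolves the AMI for the stack's region."""
    return {"Fn::FindInMap": [map_name, {"Ref": "AWS::Region"}, key]}


def missing_regions(ami_map, required_regions, key):
    """List the regions that still lack a baked image for the given key."""
    return sorted(r for r in required_regions if key not in ami_map.get(r, {}))


if __name__ == "__main__":
    template_fragment = {
        "Mappings": {"RegionMap": CUSTOM_AMIS},
        "ImageId": image_id_lookup("RegionMap", "ImageMagick"),
    }
    print(template_fragment["ImageId"])
    # Expanding to a new region means baking and registering another image.
    print(missing_regions(CUSTOM_AMIS, ["us-east-1", "ap-southeast-2"], "ImageMagick"))
```

Every time the base image is updated, each entry in that map has to be rebuilt and replaced.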