0
votes

I am having trouble fully understanding the differences between EBS Snapshots and AMI's in AWS. I understand how to create them and the terminology (Snapshots are backups of Volumes or the disk attached to an EC2 Instance and AMI is the backup of the full EC2 instance with snapshots at given time). But, I'm not sure what exactly is saved in the snapshot.

Assume I have an EC2 instance with LAMP stack installed on it and some data from the database. I understand the snapshot will save all my data and website files. When I create the new EC2 instance and attach the volume backed by my snapshot, do I then have to install Apache, MySQL, and PHP again? Or is that all saved in the snapshot? I am not worried about having to redo the security settings, instance type, etc for the new EC2 instance, but am worried all the configuration files are lost unless I have an AMI.

1

1 Answers

3
votes

I understand how to create them and the terminology (Snapshots are backups of Volumes or the disk attached to an EC2 Instance and AMI is the backup of the full EC2 instance with snapshots at given time)

This is kinda correct, but not exactly. A better understanding of the concepts may help you in general, so here it goes.

A snapshot represents the state of an EBS volume at an exact point in time, taken atomically, that you can use to create other EBS volumes from. Those new EBS volumes, created from the snapshot, will start with the exact same content as the original EBS volume at the time the snapshot was taken. So, kind of a "backup of volumes").

One extremely important thing about Snapshots, though, that you seem to be missing based on the rest of your question, is that Snapshots are immutable. Once create, the contents of a snapshot cannot change, ever.

An AMI is an Image — it's used to create an instance from. It's a bunch of metadata, which includes the type of hypervisor required, accounts allowed to use it, and ultimately it contains a list of EBS Snapshots that should be used to create volumes, and where exactly to attach the volumes created from those snapshots.

So, while "creating an AMI" is indeed a strategy some people use to create a "backup of the full EC2 instance", it's not exactly the same thing.

When I create the new EC2 instance and attach the volume backed by my snapshot [...]

I think you're confusing some concepts here.

When a new EBS volume is created from a snapshot (either by you creating the EBS volume directly and selecting a snapshot, or by using an AMI, which implicitly means that EBS volumes are going to be created from the Snapshots as specified in the metadata that the AMI represents), there's no relationship between that snapshot and the new volume anymore. Any changes made to the volume are completely independent of the original snapshot. Remember: snapshots are immutable.

So, this notion of "the volume backed by my snapshot" may be quite unhelpful: there's no link between them.

Hoping this is clear, let's move on...

When I create the new EC2 instance [...], do I then have to install Apache, MySQL, and PHP again? Or is that all saved in the snapshot?

The software that will come pre-installed in the EC2 instance is defined by the contents of the snapshots used when the EBS volumes attached to the instance were created.

In other words, if you create an EC2 instance from a standard AMI (not one that you create) that doesn't come with the software pre-installed, then the software won't be installed.

If you create an AMI from your instance before you install the software you want, then the AMI (or, more precisely, the snapshots referenced by the AMI) won't have the software, so any instances created from that AMI won't come with the software.

Now, here's what you are probably looking for:

If you (1) create an instance, then (2) install the software, and only then (3) you create an AMI, in this order, then any new instances created from that AMI will come with the software pre-installed.

The easiest way to fully and truly understand this is to remember that (A) Snapshots are immutable, and (B) AMIs are immutable references to Snapshots.


All that said, while creating a custom AMI is indeed a completely valid and popular approach to the problem you are dealing with, there's another approach that's also completely valid and popular, but that is more flexible, and could be worth investigating.

Instead of having to create and manage AMIs (*), what you could do is use a user data script.

Essentially, what it does is it allows you to have a script, that will execute as root, on the first boot of a newly created instance. This script is often used to install software, update packages, download configuration files, etc.

This way, if you need to change versions of software, or change configuration files, etc, you don't need to go through all the complexity of managing AMIs. Instead, you simply update the script.

For more info on user data, check this out: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html

(*) "manage AMIs": remember that an AMI is an immutable set of metadata which includes reference to Snapshots, which are also immutable. So if you ever need to change some settings, or have new versions of the software be installed, you will need to create new AMIs. This could become something quite complicated and time-consuming. In fact, there's a ton of tools out there, some by AWS, some by other companies, that try to simplify the process of creating and managing AMIs. So just be aware that, while doable, valid, and popular, it's an approach that can get messy!