0
votes

I've spent a number of days looking into putting up two Windows Servers on Amazon, a domain controller and a remote desktop services server but there are a few questions that I can't find detailed or any answers for:

1) When you have an EBS backed instance I assume this means that all files (OS/Applications/Pagefile) etc are all stored on EBS? Physically in the datacentre, lets assume I have 50 gig of OS files/application data etc, are these all stored on just one SAN type device? What happens if that device blows up or say that particular data centre gets destroyed. Is the data elsewhere? What is the probability that your entire EBS volume can just disappear?

2) As I understand it you can backup your EBS instance to S3 with snapshotting. I assume you can choose how often to snapshot (say daily?). In my above scenario if I have 50 gig of files, and snapshot once a day. Over 7 days will my S3 storage be 350 gig or will it be 50 gig + incremental changes I have made over the week?

3) I remember reading somewhere that the instance has to go offline to snapshot. If that is the case does it do this by shutting down the guest OS, snapshotting then booting up or does it just detach the data, prevent you from connecting while it snapshots, then bring it back to the exact moment before it went for a snapshot.

4) I understand the concept of paying per month per gig of space but how I am concerned about the $0.11 per 1 million I/O requests. How does that work when I am running a windows server? I have no idea how many I/O requests a server makes to its disks. I am assuming a lot of the entire VM is being stored on an EBS volume. Is running a server on the standard EBS going to slow it down radically?

5) Are people using the snapshot to S3 as their main backup are are people running other types of backup for Data?

Sorry for the noob questions - I'd appreciate any partial answers, answers or advice anyone could offer me. Thanks in advance!

1
Hi I cant answer answer all your questions however to answer point 1. Each Availability zone is made up of multiple data centres. When you create an EBS volume, Amazon have a concept of Eventually Consistent. This means that over time that volume will be replicated amongst all Data centres within that zone. So if one did blow up the data would still be recoverable and not totally lost. At least thats my understanding.SCB

1 Answers

1
votes

1) amazon is fuzzy on this. They say that data is replicated within the AZ it belongs to and that if you have less than 20GB of data changed since the last snapshot your annual failure rate is ~ 0.1-0.4%

2) snapshots are triggered manually, and are done incrementally

3) Depends on your filesystem. For example on a linux box with an xfs volume you can freeze IO to the volume, do your snapshot (takes only a second or so) and then unfreeze. If you take a snapshot without doing something similar you run the risk of the data being in an inconsistent state. This will depend on your filesystem

4) I run all my instances on EBS. You probably wouldn't want your pagefile on EBS, it would make more sense to use instance storage for that. The amount of IOs you use will be very dependant on the workload. The IO count depends heavily on your workload - an application server does a lot less IOPs than a database server for example. You're unlikely to use more than a few dollars a month per volume if you're running particularly IO heavy operations

5) Personally I don't care about the installed software/configuration (I have AMIs with that all setup so I can restore that in minutes), I only care about the data. I back that data up separately (S3 & Glacier). Partly that's because I was bitten by a bug EBS had about a year ago or so where they lost some snapshots

You also use multiple strategies, as Fantius commented. For example on the mongodb servers I run the boot volume is small (and never snapshotted or backed up since it can be restored automatically from an AMI), with a separate data volume containing the actual mongodb data. The mongodb volume is snapshotted as well as storing dumps on S3. Snapshots are an efficient way of creating backups (since you're only storing incremental changes) however you can't transfer them out of your EC2 region, whereas a tarball on S3 can easily be copied anywhere.