140
votes

We currently use multiple webservers accessing one mysql server and fileserver. Looking at moving to the cloud, can I use this same setup and attach the EBS to multiple machine instances or what's another solution?

11
The answer given is no. But the reasons cited are bugging me. Everybody says "it would be like mounting the same hard drive to multiple computers" -- and I go "and so what?" With SCSI or a FibreChannel SAN we do that all the time; it makes total sense, e.g. for a read-only mount by multiple servers on the same read-only data. Oracle and other big RDBMSs are designed to run in cluster mode, where multiple servers use the same physical storage. It can be much faster. EBS/NFS is veeeery slow, not an option. But mind you, even if you could attach to multiple EC2 instances, your IOPS would still be capped. – Gunther Schadow
As of Feb 2020, you can attach certain types of EBS to multiple EC2 instances - aws.amazon.com/blogs/aws/… – Koshur

11 Answers

137
votes

UPDATE (April 2015): For this use case, you should start looking at the new Amazon Elastic File System (EFS), which is designed to be mounted by multiple instances in exactly the way you are wanting. The key difference between EFS and EBS is that they provide different abstractions: EFS exposes the NFSv4 protocol, whereas EBS provides raw block-IO access.
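
To make the difference between the two abstractions concrete, here is a minimal, hedged boto3 sketch (the subnet, security group, volume, and instance IDs are placeholders, not values from the question): EFS gives you a filesystem that any number of instances mount over NFSv4, while EBS is a block device attached to a single instance.

```python
import boto3

efs = boto3.client("efs")
ec2 = boto3.client("ec2")

# EFS: a shared filesystem. Create it once, add a mount target per subnet,
# then any number of instances mount it over NFSv4.
fs = efs.create_file_system(CreationToken="shared-web-assets",
                            PerformanceMode="generalPurpose")
efs.create_mount_target(FileSystemId=fs["FileSystemId"],
                        SubnetId="subnet-0123456789abcdef0",      # placeholder
                        SecurityGroups=["sg-0123456789abcdef0"])  # placeholder

# EBS: a raw block device. It attaches to exactly one instance (barring the
# later multi-attach feature discussed in other answers).
ec2.attach_volume(VolumeId="vol-0123456789abcdef0",   # placeholder
                  InstanceId="i-0123456789abcdef0",   # placeholder
                  Device="/dev/sdf")
```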

Below you'll find my original explanation as to why it's not possible to safely mount a raw block device on multiple machines.


ORIGINAL POST (2011):

Even if you were able to get an EBS volume attached to more than one instance, it would be a _REALLY_BAD_IDEA_. To quote Kekoa, "this is like using a hard drive in two computers at once"

Why is this a bad idea? ... The reason you can't attach a volume to more than one instance is that EBS provides a "block storage" abstraction upon which customers run a filesystem like ext2/ext3/etc. Most of these filesystems (eg, ext2/3, FAT, NTFS, etc) are written assuming they have exclusive access to the block device. Two instances accessing the same filesystem would almost certainly end in tears and data corruption.

In other words, double-mounting an EBS volume would only work if you were running a cluster filesystem that is designed to share a block device between multiple machines. Furthermore, even this wouldn't be enough. EBS would need to be tested for this scenario to ensure that it provides the same consistency guarantees as other shared block device solutions ... i.e., that blocks aren't cached at intermediate non-shared levels like the Dom0 kernel, Xen layer, and DomU kernel. And then there are the performance considerations of synchronizing blocks between multiple clients - most clustered filesystems are designed to work on high-speed dedicated SANs, not best-effort commodity Ethernet. It sounds so simple, but what you are asking for is a very nontrivial thing.

Alternatively, see if your data-sharing scenario can be served by NFS, SMB/CIFS, SimpleDB, or S3. These solutions all use higher-layer protocols that are intended to share files without a shared block-device subsystem. Many times such a solution is actually more efficient.

In your case, you can still have a single MySQL instance / fileserver that is accessed by multiple web front-ends. That fileserver could then store its data on an EBS volume, allowing you to take nightly snapshot backups. If the instance running the fileserver is lost, you can detach the EBS volume, reattach it to a new fileserver instance, and be back up and running in minutes.
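
As a rough illustration of that recovery path, here is a hedged boto3 sketch (the volume, instance, and device names are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")
VOLUME_ID = "vol-0123456789abcdef0"   # placeholder: the fileserver's data volume

# Nightly backup: snapshot the data volume.
ec2.create_snapshot(VolumeId=VOLUME_ID,
                    Description="nightly fileserver backup")

# Recovery: detach from the lost fileserver, wait until the volume is free,
# then attach it to the replacement instance.
ec2.detach_volume(VolumeId=VOLUME_ID,
                  InstanceId="i-0123456789abcdef0")   # placeholder: old instance
ec2.get_waiter("volume_available").wait(VolumeIds=[VOLUME_ID])
ec2.attach_volume(VolumeId=VOLUME_ID,
                  InstanceId="i-0fedcba9876543210",   # placeholder: new instance
                  Device="/dev/sdf")
```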

"Is there anything like S3 as a filesystem?" - yes and no. Yes, there are 3rd party solutions like s3fs that work "ok", but under the hood they still have to make relatively expensive web service calls for each read / write. For a shared tools dir, works great. For the kind of clustered FS usage you see in the HPC world, not a chance. To do better, you'd need a new service that provides a binary connection-oriented protocol, like NFS. Offering such a multi-mounted filesystem with reasonable performance and behavior would be a GREAT feature add-on for EC2. I've long been an advocate for Amazon to build something like that.

85
votes

Update (2020) It is now possible!

This is now possible with certain volume types attached to the newest Nitro-based instance types within the same Availability Zone. There are some caveats, but this is great for certain use cases that need the speed of EBS and where EFS isn't feasible.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volumes-multi.html
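
For the record, here is a hedged boto3 sketch of what that looks like (the AZ and instance IDs are placeholders; multi-attach needs a Provisioned IOPS volume created with MultiAttachEnabled and Nitro-based instances in the same AZ, and you still need a cluster-aware filesystem on top, exactly as the accepted answer warns):

```python
import boto3

ec2 = boto3.client("ec2")

# Create a Provisioned IOPS volume with multi-attach enabled.
vol = ec2.create_volume(AvailabilityZone="us-east-1a",   # placeholder AZ
                        Size=100,
                        VolumeType="io1",
                        Iops=5000,
                        MultiAttachEnabled=True)
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

# Attach the same volume to two Nitro-based instances in that AZ.
for instance_id in ("i-0123456789abcdef0", "i-0fedcba9876543210"):  # placeholders
    ec2.attach_volume(VolumeId=vol["VolumeId"],
                      InstanceId=instance_id,
                      Device="/dev/sdf")

# Note: without a cluster-aware filesystem (or strictly read-only use),
# writing from both instances will corrupt an ordinary ext4/xfs filesystem.
```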


Original Post (2009)

No, this is like using a hard drive in two computers.

If you want shared data, you can set up a server that all your instances can access. If you want a simple storage area for all your instances, you can use Amazon's S3 storage service to store data that is distributed and scalable.

Moving to the cloud, you can have the exact same setup, but you can possibly replace the fileserver with S3, or have all your instances connect to your fileserver.
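
If the fileserver mostly serves whole files, a hedged boto3 sketch of the S3 route (the bucket name and keys are placeholders) might look like this:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-shared-assets"  # placeholder bucket name

# Any web server can write an object...
with open("report.pdf", "rb") as f:
    s3.put_object(Bucket=BUCKET, Key="uploads/report.pdf", Body=f)

# ...and any other web server can read it back, with no shared block device.
obj = s3.get_object(Bucket=BUCKET, Key="uploads/report.pdf")
data = obj["Body"].read()
```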

You have a lot of options, but sharing a hard drive between instances is probably not the best option.

14
votes

No, according to the EBS docs: "A volume can only be attached to one instance at a time".

How are you using the shared storage currently? If it's just for serving files from the fileserver, have you considered setting up a system so that you could proxy certain requests to a process on the fileserver rather than having the webservers serve those files?

5
votes

Multiple web servers accessing a MySQL server and a file server is a normal setup in AWS. Some best practices for this architecture are:

Point 1) MySQL on EC2 can be set up as master-slave in async/semi-sync mode in AWS. EBS-optimized instances with Provisioned IOPS (PIOPS) volumes in RAID 0 are recommended for a high-performance database.

Point 2) Alternatively, you can use Amazon RDS in Multi-AZ mode. For read scaling, multiple RDS read replicas can be attached to the MySQL RDS instance (see the sketch after this list).

Point 3) An EBS volume cannot be attached to multiple EC2 instances simultaneously. You can instead build a file server based on GlusterFS on Amazon EC2 using EBS; multiple web servers can talk to a single GlusterFS deployment simultaneously on AWS infrastructure.

Point 4) If your application can be integrated with S3 as a file store, that is preferred because of the stability it brings to the architecture. You can also access S3 from your application using tools like S3fuse.
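
For Point 2, here is a hedged boto3 sketch of a Multi-AZ MySQL RDS instance plus a read replica (all identifiers, sizes, and credentials are placeholders):

```python
import boto3

rds = boto3.client("rds")

# Primary MySQL instance with Multi-AZ failover.
rds.create_db_instance(DBInstanceIdentifier="web-mysql-primary",  # placeholder
                       DBInstanceClass="db.m5.large",
                       Engine="mysql",
                       AllocatedStorage=100,
                       MasterUsername="admin",
                       MasterUserPassword="change-me-please",     # placeholder
                       MultiAZ=True)

# Read replica for scaling read traffic from the web servers.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="web-mysql-replica-1",
    SourceDBInstanceIdentifier="web-mysql-primary")
```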

4
votes

I'm fairly sure you can't, but you can clone an EBS volume (via a snapshot) and attach the copy to another instance.

This is useful for fixed datasets, or for testing on 'real' data, but it doesn't allow more than one instance to operate on a single block store.
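
A hedged boto3 sketch of that cloning workflow (the volume, instance, and AZ names are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# Snapshot the source volume, then create a fresh volume from the snapshot.
snap = ec2.create_snapshot(VolumeId="vol-0123456789abcdef0",  # placeholder source
                           Description="clone for second instance")
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

clone = ec2.create_volume(SnapshotId=snap["SnapshotId"],
                          AvailabilityZone="us-east-1a")       # placeholder AZ
ec2.get_waiter("volume_available").wait(VolumeIds=[clone["VolumeId"]])

# Attach the independent copy to the other instance.
ec2.attach_volume(VolumeId=clone["VolumeId"],
                  InstanceId="i-0fedcba9876543210",            # placeholder
                  Device="/dev/sdf")
```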

2
votes

There is something in the IT world known as a clustered filesystem: Red Hat GFS, Oracle OCFS2, Veritas CFS...

2
votes

The short answer is a categorical "No". Others have said it above.

Those who said "yes" did not answer the question, but a different question. If EFS is just an NFS service, then it isn't the answer to the question as originally stated. And it doesn't matter if EFS is "rolled out in all zones" or not, because you can do your own NFS instance quite, and have multiple servers mount NFS. That isn't anything new, we've done that in 1992 already. SMB and sshfs, all of these are just ways to mount drives as a remote file system.

Those who said "why would you want to do that" or "it will all end in tears" are wrong. We have been mounting multiple disks to multiple servers for decades. If you ever worked with a SAN (Storage Area Network) the ability to attach the same device to multiple nodes usually through FibreChannel SAN is completely normal. So anyone who has run servers a decade ago before the virtualization / cloud servers became ubiquitous has some exposure to that.

Soon there were clustered file systems where two systems could read and write to the exact same volume. I believe this started back in the VAX and Alpha VMS era already. Clustered file systems use a distributed mutual-exclusion scheme to be able to manipulate blocks directly.

The advantages of mounting the same disk to multiple nodes are speed and the reduction of single points of failure.

Now, clustered file systems have not become hugely popular in the "consumer" hosting business, that is true. And they are complicated and have some pitfalls. But you don't even need a clustered file system to make use of a disk attached to multiple compute nodes. What if you want a read-only drive? You don't even need a clustered file system! You just put the same physical device into your /etc/fstab as read-only (ro). Then you mount it on 2 or 10 EC2 servers and all of them can read directly from that device!

There is an obvious use case for this in the world of cloud servers when building rapidly scaling farms. You can have your main system disk all prepared and use just a very small boot and configuration disk for each of the servers. You can even have all of them boot from the same boot disk, and right before the remount of / in read-write mode, you can insert a Union-FS with 3 layers:

  1. The main read-only system disk, with boot, kernel, and userland installation
  2. A configuration file system, with only the few files (mostly in /etc) that are specific to the individual server. This disk can be written out by another server to prepare for booting a new instance. Example files here would be /etc/hostname and just very few configuration files that need to remain different per node.
  3. The writable disk, which you may not need at all, could be just /tmp as a memory file system.

So, yes, the question made a lot of sense, and unfortunately the answer is (still) "No". And no, NFS is not a great replacement for that use case, as it penalizes all read activity from the system disk. However, network boot from an NFS system disk is the only alternative for implementing the use case I described above, and unfortunately setting up a network boot agent and NFS is much trickier than just accessing the same physical block device.

PS: I would have liked to submit a shorter version of this as a comment, but I cannot because of the silly 51 credit points threshold, so I have to write an answer with the same essential "No", but including my point about why this is a relevant question that has not been receiving a deserved answer.

PPS: I just found someone over at StackExchange mention iSCSI. iSCSI is somewhat like NFS in that it runs over the network, but logically it is like a FibreChannel SAN: you get to access (and share) physical block devices. It would make boot-disk sharing easier, so you don't need to set up the bootd network booting, which can be finicky. But then on AWS, there is no network booting available either.

1
votes

Although EBS previously only allowed a single EC2 instance to be attached to a given volume, multi-attach is now possible, at least for io1 volumes. For more information, see this AWS blog post.

0
votes

Why not create one instance with the volume attached and then sshfs to that volume from the other instances?

-3
votes

You can totally use one drive on multiple servers in AWS. I use sshfs to mount an external drive and share it with multiple servers in EC2.

The reason I needed to connect a single drive to multiple servers is to have a single place to put all my backups before pulling them down locally.