We currently use multiple web servers accessing one MySQL server and one file server. Looking at moving to the cloud, can I keep this same setup and attach an EBS volume to multiple machine instances, or is there another solution?
UPDATE (April 2015): For this use case, you should start looking at the new Amazon Elastic File System (EFS), which is designed to be mounted by multiple instances concurrently, in exactly the way you are wanting. The key difference between EFS and EBS is that they provide different abstractions: EFS exposes the NFSv4 protocol, whereas EBS provides raw block IO access.
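Once an EFS file system and a mount target exist, each instance mounts it over NFSv4.1. A rough sketch (the file system ID and region below are placeholders, not real values):

```
# On each web server (requires the NFS client package, e.g. nfs-utils/nfs-common,
# and an EFS mount target in the instance's subnet).
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
    fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs
```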
Below you'll find my original explanation as to why it's not possible to safely mount a raw block device on multiple machines.
ORIGINAL POST (2011):
Even if you were able to get an EBS volume attached to more than one instance, it would be a _REALLY_BAD_IDEA_. To quote Kekoa, "this is like using a hard drive in two computers at once".
Why is this a bad idea? ... The reason you can't attach a volume to more than one instance is that EBS provides a "block storage" abstraction upon which customers run a filesystem like ext2/ext3/etc. Most of these filesystems (eg, ext2/3, FAT, NTFS, etc) are written assuming they have exclusive access to the block device. Two instances accessing the same filesystem would almost certainly end in tears and data corruption.
In other words, double-mounting an EBS volume would only work if you were running a cluster filesystem that is designed to share a block device between multiple machines. Furthermore, even this wouldn't be enough. EBS would need to be tested for this scenario to ensure that it provides the same consistency guarantees as other shared-block-device solutions ... i.e., that blocks aren't cached at intermediate non-shared levels like the Dom0 kernel, Xen layer, and DomU kernel. And then there are the performance considerations of synchronizing blocks between multiple clients - most clustered filesystems are designed to work on high-speed dedicated SANs, not best-effort commodity Ethernet. It sounds so simple, but what you are asking for is a very nontrivial thing.
Alternatively, see if your data-sharing scenario can be served by NFS, SMB/CIFS, SimpleDB, or S3. These solutions all use higher-layer protocols that are designed to share files without needing a shared block-device subsystem. Often such a solution is actually more efficient.
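For example, a plain NFS export from the file server is often all that's needed. A minimal sketch, where the paths, subnet, and hostname are placeholders:

```
# On the file server: export a directory to the web servers' subnet.
echo '/srv/share 10.0.0.0/24(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra

# On each web server: mount the export.
sudo mkdir -p /mnt/share
sudo mount -t nfs fileserver.internal:/srv/share /mnt/share
```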
In your case, you can still have a single MySQL instance / file server that is accessed by multiple web front-ends. That file server could then store its data on an EBS volume, allowing you to take nightly snapshot backups. If the instance running the file server is lost, you can detach the EBS volume, reattach it to a new file server instance, and be back up and running in minutes.
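A rough sketch of that backup/recovery flow with the AWS CLI (the volume ID, instance ID, and device name are placeholders):

```
# Nightly snapshot of the file server's data volume.
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
    --description "nightly fileserver backup"

# Recovery: move the data volume to a replacement file server instance.
aws ec2 detach-volume --volume-id vol-0123456789abcdef0
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
    --instance-id i-0fedcba9876543210 --device /dev/sdf
```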
"Is there anything like S3 as a filesystem?" - yes and no. Yes, there are 3rd party solutions like s3fs that work "ok", but under the hood they still have to make relatively expensive web service calls for each read / write. For a shared tools dir, works great. For the kind of clustered FS usage you see in the HPC world, not a chance. To do better, you'd need a new service that provides a binary connection-oriented protocol, like NFS. Offering such a multi-mounted filesystem with reasonable performance and behavior would be a GREAT feature add-on for EC2. I've long been an advocate for Amazon to build something like that.
Update (2020) It is now possible!
This is possible now with the newest instance types running in AWS Nitro within the same Availability Zone. There are some caveats but this is great for certain use cases that need the speed of EBS and where EFS isn't feasible.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volumes-multi.html
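A rough sketch with the AWS CLI (all IDs are placeholders; both instances must be Nitro-based and in the same Availability Zone as the volume):

```
# Create a Provisioned IOPS volume with Multi-Attach enabled.
aws ec2 create-volume --volume-type io1 --iops 1000 --size 100 \
    --availability-zone us-east-1a --multi-attach-enabled

# Attach the same volume to two instances.
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
    --instance-id i-0aaaaaaaaaaaaaaaa --device /dev/sdf
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
    --instance-id i-0bbbbbbbbbbbbbbbb --device /dev/sdf
```

Note that Multi-Attach only gives you shared block access; to write safely from more than one instance you still need a cluster-aware file system on top of it, not plain ext4/XFS.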
Original Post (2009)
No, this is like using a hard drive in two computers.
If you want shared data, you can setup a server that all your instances can access. If you are wanting a simple storage area for all your instances, you can use Amazon's S3 storage service to store data that is distributed and scalable.
Moving to the cloud, you can have the exact same setup, but you can possibly replace the fileserver with S3, or have all your instances connect to your fileserver.
You have a lot of options, but sharing a hard drive between instances is probably not the best option.
No, according to the EBS docs: "A volume can only be attached to one instance at a time".
How are you using the shared storage currently? If it's just for serving files from the fileserver, have you considered setting up a system so that you could proxy certain requests to a process on the fileserver rather than having the webservers serve those files?
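For instance, a small reverse-proxy rule on each web server could forward file requests to a process on the file server. A hypothetical nginx sketch (the hostname, port, and path are made up):

```
server {
    listen 80;

    # Forward /files/ requests to a process running on the file server.
    location /files/ {
        proxy_pass http://fileserver.internal:8080/;
    }
}
```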
Multiple Web Servers accessing MySQL Server & File Server is normal in AWS. Some of the best practices to be followed for the above mentioned architecture are:
Point 1) MySQL on EC2 can be set up as master-slave in async/semi-sync replication mode in AWS. EBS-optimized instances with Provisioned IOPS (PIOPS) volumes in RAID 0 are recommended for a high-performance DB.
Point 2) Alternatively, you can use Amazon RDS in Multi-AZ mode. For read scaling, multiple RDS read replicas can be attached to a MySQL RDS instance.
Point 3) An EBS volume cannot be attached to multiple EC2 instances simultaneously. You can build a file server based on GlusterFS on Amazon EC2 using EBS; multiple web servers can talk to a single GlusterFS volume simultaneously on AWS infrastructure (see the sketch after this list).
Point 4) If your application can be integrated with S3 as a file store, that is preferred because of the stability it brings to the architecture. You can also access S3 from your application using tools like S3fuse.
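A minimal GlusterFS sketch for Point 3 (hostnames, brick paths, and the mount point are placeholders):

```
# On the storage nodes: form the cluster and create a replicated volume.
sudo gluster peer probe gluster2.internal
sudo gluster volume create webdata replica 2 \
    gluster1.internal:/data/brick1 gluster2.internal:/data/brick1
sudo gluster volume start webdata

# On each web server: mount the shared volume.
sudo mount -t glusterfs gluster1.internal:/webdata /var/www/shared
```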
EBS just announced this is possible: https://aws.amazon.com/about-aws/whats-new/2020/02/ebs-multi-attach-available-provisioned-iops-ssd-volumes/
The short answer is a categorical "No". Others have said it above.
Those who said "yes" did not answer the question, but a different question. If EFS is just an NFS service, then it isn't the answer to the question as originally stated. And it doesn't matter if EFS is "rolled out in all zones" or not, because you can do your own NFS instance quite, and have multiple servers mount NFS. That isn't anything new, we've done that in 1992 already. SMB and sshfs, all of these are just ways to mount drives as a remote file system.
Those who said "why would you want to do that" or "it will all end in tears" are wrong. We have been mounting multiple disks to multiple servers for decades. If you ever worked with a SAN (Storage Area Network) the ability to attach the same device to multiple nodes usually through FibreChannel SAN is completely normal. So anyone who has run servers a decade ago before the virtualization / cloud servers became ubiquitous has some exposure to that.
Soon there were clustered file systems where two systems could read and write to the exact same volume. I believe this started with the VAX and Alpha VMS time in history already. Clustered file systems use a distributed mutual exclusion scheme to be able to manipulate blocks directly.
The advantage of mounting the same disk to multiple nodes is speed and reducing single points of failures.
Now, clustered file systems have not become hugely popular in the "consumer" hosting business, that is true. They are complicated and have some pitfalls. But you don't even need a clustered file system to make use of a disk attached to multiple compute nodes. What if you want a read-only drive? You just put the same physical device into your /etc/fstab as read-only (ro), then mount it on 2 or 10 EC2 servers, and all of them can read directly from that device!
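A sketch of such an fstab entry (the device name is a placeholder; for a journaling filesystem like ext4, 'noload' skips journal replay so the mount stays strictly read-only):

```
# /etc/fstab on each instance that shares the read-only device
/dev/nvme1n1  /srv/shared  ext4  ro,noload  0  0
```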
There is an obvious use case for this in the world of cloud servers when building rapidly scaling farms. You can have your main system disk all prepared and use just a very small boot and configuration disk for each of the servers. You can even have all of them boot from the same boot disk, and right before the remount of / in read-write mode, you can insert a union file system with three layers (a sketch using overlayfs follows this list):
- The main read-only system disk, with boot, kernel, and userland installation
- A configuration file system, with only the few files (mostly in /etc) that are specific to the individual server. This disk can be written out by another server to prepare for booting a new instance. Example files here would be /etc/hostname and just very few configuration files that need to remain different per node.
- The writable disk, which you may not need at all, could be just /tmp as a memory file system.
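A sketch of assembling those three layers with overlayfs (device names and paths are placeholders; the leftmost lowerdir wins on conflicts):

```
mount -o ro /dev/nvme1n1 /mnt/system      # shared read-only system disk
mount -o ro /dev/nvme2n1 /mnt/config      # tiny per-node configuration disk
mount -t tmpfs tmpfs /mnt/rw              # writable layer held in memory
mkdir -p /mnt/rw/upper /mnt/rw/work /newroot
mount -t overlay overlay \
    -o lowerdir=/mnt/config:/mnt/system,upperdir=/mnt/rw/upper,workdir=/mnt/rw/work \
    /newroot
```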
So yes, the question made a lot of sense, and unfortunately the answer is (still) "No". And no, NFS is not a great replacement for that use case, as it penalizes all read activity from the system disk. However, network boot from an NFS system disk is the only alternative for implementing the use case I described above, and unfortunately setting up a network boot agent and NFS is much trickier than just accessing the same physical block device.
PS: I would have liked to submit a shorter version of this as a comment, but I cannot because of the silly 51-credit-point threshold, so I have to write an answer with the same essential "No", but including my point about why this is a relevant question that has not received the answer it deserves.
PPS: I just found someone over at Stack Exchange mention iSCSI. iSCSI is somewhat like NFS in that it goes over the network, but logically it is like a Fibre Channel SAN: you get to access (and share) physical block devices. It would make boot-disk sharing easier, since you wouldn't need to set up bootd network booting, which can be finicky. But then on AWS there is no network booting available either.
Although EBS has previously only allowed a single EC2 instance to be attached to a given volume, multi-attach is now possible, at least for io1 volumes. For more information, see this AWS blog post.