0
votes

Simple question for GCE users: are persistent boot disks safe to use, or could data loss occur?

I've seen that I can attach additional persistent disks, but what about the standard boot disks (which should be persistent as well)? What happens during maintenance, equipment failures, and so on? Are these boot disks stored on hardware with built-in redundancy (RAID and so on)?

In other words, is a compute instance with a persistent boot disk similar to a non-cloud VM stored on local RAID (from a data-loss point of view)? Usually cloud instances are volatile, a crash, shutdown, maintenance and so on, will destroy all data stored.

Obviously, I'll have backups.

2
Note: you can boot a VM from a persistent disk created by any means (empty and then filled with data, from a snapshot, or from an image). Creating one from an image is, of course, the most common way. Disks created this way should have the same durability characteristics as normal persistent disks. All persistent disks are stored with redundancy (like RAID), so hardware failures and maintenance events are invisible to users (modulo very rare events and bugs). Local SSD gives you access to a single physical SSD card on a VM and is not guaranteed to be persistent in the case of a crash. - atomictom

2 Answers

0
votes

GCE Persistent Disks are designed to be durable and highly available:

Persistent disks are durable network storage devices that your instances can access like physical disks in a desktop or a server. The data on each persistent disk is distributed across several physical disks. Compute Engine manages the physical disks and the data distribution to ensure redundancy and optimize performance for you.

(emphasis my own, source: Google documentation)

You have a choice of zonal or regional (currently in public beta) persistent disks, on an HDD or SSD-based platform. For boot disks, only zonal disks are supported as of the time of this writing.

As the name suggests, zonal disks are only guaranteed to persist their data within a single zone; an outage or failure of that zone may render the data unavailable. Writes to regional disks are replicated to two zones in a region to safeguard against the outage of any one zone. The "Disks" section of the Google Compute Engine console will show you that the boot disks for your instances are zonal persistent disks.

Irrespective of the durability, it is obviously wise to keep your own backups of your persistent disks in another form of storage to safeguard against other causes of data loss, such as corruption in your application or user error by an operator. Snapshots of persistent disks are replicated to other regions; however, be aware of their lifecycle in the event the parent disk is deleted.
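As a rough sketch of how such snapshot backups could be scripted, the snippet below only builds the REST request for the Compute Engine `disks.createSnapshot` method; it does not send anything. The project, zone, and disk names are placeholders, and the helper `snapshot_request` is a made-up name for illustration, not part of any Google library.

```python
from datetime import datetime, timezone

API_ROOT = "https://compute.googleapis.com/compute/v1"


def snapshot_request(project, zone, disk, now=None):
    """Return (url, body) for a disks.createSnapshot call on a zonal disk.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    now = now or datetime.now(timezone.utc)
    suffix = f"{now:%Y%m%d-%H%M%S}"
    # Resource names are limited to 63 characters, so trim the disk-name
    # prefix rather than the timestamp.
    name = f"{disk[:63 - len(suffix) - 1]}-{suffix}"
    url = f"{API_ROOT}/projects/{project}/zones/{zone}/disks/{disk}/createSnapshot"
    return url, {"name": name}


if __name__ == "__main__":
    # Placeholder identifiers; substitute your own.
    url, body = snapshot_request("my-project", "europe-west1-b", "my-boot-disk")
    print("POST", url)
    print(body)
```

The request would still need an authenticated HTTP client (or the official client library) to be sent; that part is left out here on purpose.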

In addition to reviewing the comprehensive page linked above, I recommend reviewing the relevant SLA documentation to ascertain the precise guarantees and service levels offered to you.

Usually cloud instances are volatile, a crash, shutdown, maintenance and so on, will destroy all data stored.

The cloud model does indeed prefer instances which are stateless and can be replaced at will. This offers many scalability and robustness advantages, which can be achieved using managed instance groups, for example. However, you can use VMs for persistent storage if desired.

1
votes

Normally, the data on the boot disk should survive restarts and other maintenance operations. However, by default the boot disk is deleted along with the instance.
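A minimal sketch of turning that default off at creation time: the boot disk entry in a Compute Engine `instances.insert` request body carries an `autoDelete` field. The helper name `boot_disk_config`, the instance name, the zone, and the image path are placeholders chosen for illustration; the snippet only builds the dictionary and sends nothing.

```python
def boot_disk_config(source_image, keep_on_delete=True):
    """Build the boot-disk entry for the `disks` list of an instances.insert body.

    Hypothetical helper for illustration only.
    """
    return {
        "boot": True,
        # autoDelete=False keeps the disk when the instance is deleted.
        "autoDelete": not keep_on_delete,
        "initializeParams": {"sourceImage": source_image},
    }


# Placeholder instance definition; adjust names, zone and image to your project.
instance_body = {
    "name": "example-vm",
    "machineType": "zones/europe-west1-b/machineTypes/e2-small",
    "disks": [
        boot_disk_config("projects/debian-cloud/global/images/family/debian-12")
    ],
    "networkInterfaces": [{"network": "global/networks/default"}],
}
```

For an instance that already exists, the same setting can be changed afterwards, for example with `gcloud compute instances set-disk-auto-delete`.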

If you use managed instance groups, preemptible instances... and you want persistent data, you should use another storage system. If you just use a plain compute instance, it should be safe enough with backups.

I still think an additional persistent disk or another storage system is a better way to do things. But it's only my opinion.