I am trying to use Velero as a backup and disaster recovery tool on Google Cloud Platform across multiple GCP regions (for example, europe-north1 and europe-west4) for private GKE clusters. Backup and restore within a single region work without any issues: I backed up a GKE cluster in europe-north1 and restored it to another GKE cluster in europe-north1. This works because the snapshots are stored in the same region (europe-north1) as both clusters.
However, I would like to use Velero for disaster recovery, so that I can back up GKE clusters in europe-north1 and restore them in europe-west4. On further research, I found that enabling CSI plugin support for Velero should make this possible. I followed the guidelines for using the CSI plugin with Velero, but I am still not able to restore the persistent disk PVCs to another region. The snapshots are taken as multi-regional (for example, eu). But when I run the velero restore command, the pods (I am using wordpress and mysql pods as examples) stay in the 'Pending' state.
kubectl describe pod (for both mysql and wordpress) shows the following events:
Normal NotTriggerScaleUp 72s (x31 over 6m12s) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added):
Warning FailedScheduling 60s (x11 over 6m14s) default-scheduler 0/4 nodes are available: 4 node(s) had volume node affinity conflict.
This error occurs because the Google persistent disks created for the PVCs are in a different region than the GKE cluster. Checking the disks, I can see that the restore created two disks, but they were created in europe-north1 (the primary GKE cluster's region) instead of europe-west4, where the secondary GKE cluster resides.
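A minimal sketch for confirming this kind of mismatch from the command line, comparing the zone recorded in each restored PV's node affinity with the zones of the cluster's nodes. The helper names (zone_to_region, same_region, list_pv_zones, list_node_zones) are hypothetical, and the sketch assumes that the PVs carry their zone in spec.nodeAffinity and that the nodes have the topology.kubernetes.io/zone label; on some clusters the older failure-domain.beta.kubernetes.io/zone label may be the one that is set.

```shell
#!/bin/sh
# Hypothetical helper: derive a region from a GCP zone name by dropping
# the trailing "-<suffix>" (europe-north1-a -> europe-north1).
# Expects a zone name, not a region name.
zone_to_region() {
  printf '%s\n' "${1%-*}"
}

# Hypothetical helper: succeeds (exit 0) when two zones share a region.
same_region() {
  [ "$(zone_to_region "$1")" = "$(zone_to_region "$2")" ]
}

# Hypothetical helper: print each PV name with the value(s) found in its
# node affinity terms (assumed to hold the disk's zone).
list_pv_zones() {
  kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeAffinity.required.nodeSelectorTerms[*].matchExpressions[*].values[*]}{"\n"}{end}'
}

# Hypothetical helper: print each node with its zone label as an extra column.
# Assumes the topology.kubernetes.io/zone label is present on the nodes.
list_node_zones() {
  kubectl get nodes -L topology.kubernetes.io/zone
}
```

Running list_pv_zones and list_node_zones against the secondary cluster, then passing one zone from each to same_region (for example, same_region europe-north1-a europe-west4-a), shows whether the restored volumes and the nodes are in different regions, which would be consistent with the "volume node affinity conflict" event.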
As the CSI plugin is a new feature in Velero, I couldn't find any documentation for using it on GCP (there is a document showing a CSI implementation with Azure Disks).
Minimum requirements for the CSI plugin to work with Velero backups:
Kubernetes version: 1.17
Velero version: 1.4.2
Velero version:
velero version
Client:
Version: v1.4.2
Git commit: 56a08a4d695d893f0863f697c2f926e27d70c0c5
Server:
Version: v1.4.2
GKE cluster Kubernetes version (cluster created with the GcePersistentDiskCsiDriver=ENABLED addon):
v1.17.9-gke.600
Primary Region:
europe-north1
Secondary (DR) Region:
europe-west4
Command used to install the Velero server (with the CSI plugin enabled):
velero install \
--features=EnableCSI \
--provider=gcp \
--image=gcr.io/$project/velero:v1.4.2 \
--plugins=gcr.io/$project/velero-plugin-for-gcp:v1.1.0,gcr.io/$project/velero-plugin-for-csi:v0.1.0 \
--bucket=$storagebucket \
--secret-file=$HOME/./velero-backup-storage-sa-key.json
Other documents that I have referred to for this:
https://velero.io/docs/v1.4/csi/#installing-velero-with-csi-support
https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/gce-pd-csi-driver
Any help would be much appreciated.