54
votes

I created a PersistentVolume backed by a Google Compute Engine persistent disk that I had already formatted and provisioned with data. Kubernetes reports the PersistentVolume as Available.

kind: PersistentVolume
apiVersion: v1
metadata:
  name: models-1-0-0
  labels:
    name: models-1-0-0
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadOnlyMany
  gcePersistentDisk:
    pdName: models-1-0-0
    fsType: ext4
    readOnly: true

I then created a PersistentVolumeClaim so that I could attach this volume to multiple pods across multiple nodes. However, Kubernetes reports the claim as Pending indefinitely.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: models-1-0-0-claim
spec:
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 200Gi
  selector:
    matchLabels:
      name: models-1-0-0

Any insights? I feel there may be something wrong with the selector...

Is it even possible to preconfigure a persistent disk with data and have pods across multiple nodes all be able to read from it?

9 Answers

70
votes

I quickly realized that a PersistentVolumeClaim's storageClassName field defaults to the cluster's default StorageClass (standard here) when not specified. A PersistentVolume, however, gets no default storageClassName, so the claim never matches the volume.

The following worked for me:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: models-1-0-0
  labels:
    name: models-1-0-0
spec:
  capacity:
    storage: 200Gi
  storageClassName: standard
  accessModes:
    - ReadOnlyMany
  gcePersistentDisk:
    pdName: models-1-0-0
    fsType: ext4
    readOnly: true
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: models-1-0-0-claim
spec:
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 200Gi
  selector:
    matchLabels:
      name: models-1-0-0
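
An alternative, sketched here rather than taken from the answer above: leave the PersistentVolume from the question unchanged (no storageClassName) and set storageClassName to the empty string on the claim. An empty string opts the claim out of the default StorageClass, so it can only bind to volumes that have no class.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: models-1-0-0-claim
spec:
  # Empty string (not omitted): do not apply the cluster's default StorageClass.
  storageClassName: ""
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: 200Gi
  selector:
    matchLabels:
      name: models-1-0-0
```

Either way, the class on the claim and the class on the volume have to agree before the selector is even considered.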

16
votes

With dynamic provisioning, you shouldn't have to create PVs and PVCs separately. In Kubernetes 1.6+, there are default provisioners for GKE and some other cloud environments, which should let you just create a PVC and have it automatically provision a PV and an underlying Persistent Disk for you.

For more on dynamic provisioning, see:

https://kubernetes.io/blog/2017/03/dynamic-provisioning-and-storage-classes-kubernetes/
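
A minimal sketch of a claim that relies on dynamic provisioning, assuming the cluster has a default StorageClass. The claim name and size are hypothetical. Note that a dynamically provisioned disk starts out empty, so this does not cover the pre-populated disk from the question.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: scratch-claim        # hypothetical name
spec:
  # storageClassName omitted on purpose: the default StorageClass is used,
  # and its provisioner creates the PV and the underlying disk.
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi          # hypothetical size
```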

9
votes

If you're using MicroK8s, you have to enable storage before a PersistentVolumeClaim can be bound successfully.

Just do:

microk8s.enable storage

You'll need to delete your deployment and start again.

You may also need to manually delete the "pending" PersistentVolumeClaims because I found that uninstalling the Helm chart which created them didn't clear the PVCs out.

You can do this by first finding a list of names:

kubectl get pvc --all-namespaces

then deleting them by name (PVCs are namespaced, so add -n <namespace> if they are not in your current namespace):

kubectl delete pvc name1 name2 etc...

Once storage is enabled, reapplying your deployment should get things going.

6
votes

I had the same issue, but for a different reason, so I am sharing it here in case it helps someone.

If you delete a PersistentVolumeClaim and then re-create it with the same definition, it will stay Pending forever. Why?

For a manually created PersistentVolume, persistentVolumeReclaimPolicy defaults to Retain. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and is considered Released. It is not yet available for another claim, because the previous claimant's data remains on the volume. You therefore need to reclaim the volume manually with the following steps:

  1. Delete the PersistentVolume. The underlying storage asset (EBS volume, GCE PD, Azure Disk, etc.) is NOT deleted and still exists.

  2. (Optional) Manually clean up the data on the associated storage asset.

  3. (Optional) Manually delete the associated storage asset (EBS volume, GCE PD, Azure Disk, etc.).

If you still need the same data, skip steps 2 and 3. Simply re-create a PersistentVolume with the same storage asset definition, and you should then be able to create the PersistentVolumeClaim again.

One last thing: Retain is not the only option for persistentVolumeReclaimPolicy. Depending on your use case, these other options may fit better:

Recycle: performs a basic scrub on the volume (e.g., rm -rf /thevolume/*) and makes it available again for a new claim. Only NFS and HostPath support recycling, and this policy is deprecated in favor of dynamic provisioning.

Delete: the associated storage asset (AWS EBS, GCE PD, Azure Disk, OpenStack Cinder volume, etc.) is deleted along with the PersistentVolume.
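
As an illustration, here is the PersistentVolume from the question with the reclaim policy spelled out explicitly. This is a sketch, not part of the original manifests; the only added field is persistentVolumeReclaimPolicy.

```yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: models-1-0-0
  labels:
    name: models-1-0-0
spec:
  capacity:
    storage: 200Gi
  # Retain keeps the disk and its data after the claim is deleted;
  # the PV then becomes Released and must be reclaimed manually.
  persistentVolumeReclaimPolicy: Retain
  accessModes:
    - ReadOnlyMany
  gcePersistentDisk:
    pdName: models-1-0-0
    fsType: ext4
    readOnly: true
```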

For more information, please check the Kubernetes documentation.

If you need more clarification or have any questions, please leave a comment and I will be happy to help.

4
votes

I was facing the same problem and realized that k8s can do just-in-time provisioning, i.e.

  • When a PVC is created, it stays in the Pending state, and no corresponding PV is created.
  • The PV (EBS volume) is created and bound to the PVC only after you create a deployment whose pods use the PVC.

I am using EKS with Kubernetes version 1.16, and this behaviour is controlled by the StorageClass volume binding mode (the volumeBindingMode field, here set to WaitForFirstConsumer).
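
A sketch of a StorageClass that produces this behaviour, assuming the in-tree AWS EBS provisioner. The class name and the gp2 volume type are illustrative assumptions, not taken from the cluster described above.

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-wait                      # hypothetical name
provisioner: kubernetes.io/aws-ebs    # in-tree EBS provisioner (assumed)
parameters:
  type: gp2                           # assumed volume type
# Delay provisioning and binding until a pod that uses the claim is scheduled.
# The alternative value, Immediate, binds as soon as the claim is created.
volumeBindingMode: WaitForFirstConsumer
```

With WaitForFirstConsumer, a Pending claim with no consuming pod is expected and is not an error by itself.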

1
votes

I've seen this behaviour in microk8s 1.14.1 when two PersistentVolumes have the same value for spec/hostPath/path, e.g.

kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-name
  labels:
    type: local
    app: app
spec:
  storageClassName: standard
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/k8s-app-data"

It seems that microk8s is event-based (which isn't necessary on a one-node cluster) and discards information about failed operations, so almost every failure comes with poor, unhelpful feedback.

1
votes

I had this problem with the (stable) Helm chart for Apache Airflow; setting storageClass to azurefile fixed it. What should you do in such cases on a cloud provider? Look for a storage class that supports the access mode you need. ReadWriteMany means that many nodes can read from and write to the storage SIMULTANEOUSLY. On Azure, the class that supports this is azurefile.

path: /opt/airflow/logs

  ## configs for the logs PVC
  ##
  persistence:
    ## if a persistent volume is mounted at `logs.path`
    ##
    enabled: true

    ## the name of an existing PVC to use
    ##
    existingClaim: ""

    ## sub-path under `logs.persistence.existingClaim` to use
    ##
    subPath: ""

    ## the name of the StorageClass used by the PVC
    ##
    ## NOTE:
    ## - if set to "", then `PersistentVolumeClaim/spec.storageClassName` is omitted
    ## - if set to "-", then `PersistentVolumeClaim/spec.storageClassName` is set to ""
    ##
    storageClass: "azurefile"

    ## the access mode of the PVC
    ##
    ## WARNING:
    ## - must be: `ReadWriteMany`
    ##
    ## NOTE:
    ## - different StorageClass support different access modes:
    ##   https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
    ##
    accessMode: ReadWriteMany

    ## the size of PVC to request
    ##
    size: 1Gi
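
For comparison, a standalone claim with the same class, access mode, and size might look like the sketch below. The claim name is hypothetical, and this is not necessarily what the chart itself renders.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: airflow-logs          # hypothetical name
spec:
  storageClassName: azurefile # a class that supports ReadWriteMany
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
```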

-1
votes

I faced the same issue: the PersistentVolumeClaim stayed in the Pending phase indefinitely. I tried setting storageClassName to 'default' in the PersistentVolume, just as I had for the PersistentVolumeClaim, but that did not fix it.

I then made one change in my persistentvolume.yml: I moved the PersistentVolumeClaim config to the top of the file and put the PersistentVolume second. That fixed the issue.

In my case, the PersistentVolumeClaim had to be created first and the PersistentVolume afterwards to resolve the 'Pending' phase.

I am posting this answer after testing it for a few times, hoping that it might help someone struggling with it.

-4
votes

Make sure your VM also has enough disk space.