3
votes

I have a Kubernetes v1.17.0 cluster with multiple nodes. I've created a PVC with the access mode set to RWO. From the Kubernetes docs:

ReadWriteOnce -- the volume can be mounted as read-write by a single node

I'm using a Cinder volume plugin which doesn't support ReadWriteMany.

When I create two different deployments that mount the same PVC, Kubernetes sometimes schedules them on two different nodes, which causes the pods to fail.

Is this the desired behaviour, or is there a problem in my configuration?

What is the error you get while deploying the pod? – Arghya Sadhu
Multi-Attach error for volume "[pv volume name]": Volume is already used by pod(s) [list of pods] – Lukas
Did you consider using nodeAffinity to deploy your pods only on the node that is able to mount the volume? – Mr.KoopaKiller
I'm currently using affinity rules; otherwise both deployments would fail. I would rather have Kubernetes decide which node is best for both deployments. – Lukas
@Lukas, if I understood correctly, you are already using affinity rules, but you would like to remove them and let Kubernetes decide where to run the pods? – Mr.KoopaKiller

3 Answers

4
votes

As I gather from your replies in the comments, you do not want to use affinity rules but want the scheduler to do this work for you.

It seems that this issue has been known since at least 2016 but has not yet been resolved, as the scheduling is considered to be working as expected: https://github.com/kubernetes/kubernetes/issues/26567

You can read the details in the issue, but the core problem is that, by definition, a ReadWriteOnce volume can never be accessed by two Pods at the same time. What would need to be implemented is a flag saying "it is OK for this RWO volume to be accessed by two Pods at the same time, even though it is RWO" — but this functionality has not been implemented yet.

In practice, you can typically work around this issue by using a Recreate Deployment Strategy: .spec.strategy.type: Recreate. Alternatively, use the affinity rules as described by the other answers.
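A minimal sketch of the Recreate strategy in a Deployment manifest (the name, labels, and image below are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp               # placeholder name
spec:
  strategy:
    type: Recreate          # old pod is terminated before the new one starts,
                            # so the RWO volume is detached before reattachment
  selector:
    matchLabels:
      app.kubernetes.io/name: myapp
  template:
    metadata:
      labels:
        app.kubernetes.io/name: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:latest  # placeholder image
```

With the default RollingUpdate strategy, the new pod starts while the old one still holds the volume, which triggers the Multi-Attach error during updates; Recreate avoids this at the cost of brief downtime.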

1
vote

Provisioning a PV/PVC and deploying new pods on the same node can only be guaranteed via node affinity. However, if you want Kubernetes to decide this for you, you will have to use inter-pod affinity.

However, just to verify that you are doing everything the right way, please refer to this.
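For reference, a nodeAffinity rule that pins pods to one specific node might look like the sketch below (the hostname value is a placeholder you would replace with your node's name):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname   # built-in node label
          operator: In
          values:
          - worker-node-1               # placeholder node name
```

The drawback, as noted in the comments, is that this hard-codes the node instead of letting the scheduler choose it.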

0
votes

Persistent volumes in Kubernetes can be tied to a node or an availability zone because of the underlying hardware: a storage drive inside a server or a SAN within a single datacenter cannot be moved around by the storage provisioner.

Now how does the storage provisioner know on which node or in which availability zone it needs to create the persistent volume? That is why persistent volume claims have a volume binding mode, which is set to WaitForFirstConsumer in this case. This means provisioning happens only after the first pod that mounts the persistent volume has been scheduled. For more details, read here.
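As a sketch, a StorageClass that delays provisioning this way could look like the following (the class name is a placeholder; the provisioner shown is the in-tree Cinder one, matching the question's setup):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cinder-wait                        # placeholder name
provisioner: kubernetes.io/cinder          # in-tree Cinder provisioner
volumeBindingMode: WaitForFirstConsumer    # delay provisioning until a pod is scheduled
```

With Immediate binding instead, the volume would be created before any pod exists, possibly in a zone where your pods can never run.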

When a second pod is scheduled, it might run on another node or in another availability zone, unless you tell the scheduler to run it on the same node or in the same availability zone as the first pod by using inter-pod affinity:

    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          # adjust the labels so that they identify your pod
          matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
            - myapp
        # make pod run on the same node
        topologyKey: kubernetes.io/hostname