0 votes

We are facing an issue with a rook-ceph deployment in Kubernetes when the Istio sidecar is enabled: the OSDs are not coming up because the crashcollector pods never finish initializing. They are stuck as shown below.

rook-ceph        csi-cephfsplugin-7jcr9                             3/3     Running            0          63m
rook-ceph        csi-cephfsplugin-c4dnd                             3/3     Running            0          63m
rook-ceph        csi-cephfsplugin-provisioner-8658f67749-6gzkk      7/7     Running            2          63m
rook-ceph        csi-cephfsplugin-provisioner-8658f67749-bgdpx      7/7     Running            1          63m
rook-ceph        csi-cephfsplugin-zj9xm                             3/3     Running            0          63m
rook-ceph        csi-rbdplugin-58xf4                                3/3     Running            0          63m
rook-ceph        csi-rbdplugin-87rjn                                3/3     Running            0          63m
rook-ceph        csi-rbdplugin-provisioner-94f699d86-rh2r6          7/7     Running            1          63m
rook-ceph        csi-rbdplugin-provisioner-94f699d86-xkv6h          7/7     Running            1          63m
rook-ceph        csi-rbdplugin-tvjvz                                3/3     Running            0          63m
rook-ceph        rook-ceph-crashcollector-node1-f7f6c6f8d-lfs6d     0/2     Init:0/3           0          63m
rook-ceph        rook-ceph-crashcollector-node2-998bb8769-pspnn     0/2     Init:0/3           0          51m
rook-ceph        rook-ceph-crashcollector-node3-6c48c99c8-7bbl6     0/2     Init:0/3           0          40m
rook-ceph        rook-ceph-mon-a-7966994c76-z9phm                   2/2     Running            0          51m
rook-ceph        rook-ceph-mon-b-8cbf8579f-g6nd9                    2/2     Running            0          51m
rook-ceph        rook-ceph-mon-c-d65968cc4-wcpmr                    2/2     Running            0          40m
rook-ceph        rook-ceph-operator-5c47844cf-z9jcb                 2/2     Running            1          67m

When we run kubectl describe on one of these crashcollector pods, we see the following warning:

Warning  FailedMount  59m                  kubelet, node1  Unable to attach or mount volumes: unmounted volumes=[rook-ceph-crash-collector-keyring], unattached volumes=[rook-config-override rook-ceph-log rook-ceph-crash-collector-keyring istio-data istio-podinfo istiod-ca-cert istio-envoy rook-ceph-crash default-token-htvcq]: timed out waiting for the condition

We also noticed that the secret 'rook-ceph-crash-collector-keyring' is never created.
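
A quick way to confirm the missing secret (just a diagnostic sketch; the namespace and secret name are taken from the mount error above):

kubectl -n rook-ceph get secret rook-ceph-crash-collector-keyring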

After a lot of debugging, we noticed that the mon pods are not reachable through their service endpoints, while everything else (the Kubernetes APIs, services in other namespaces, etc.) works just fine.

When we exec into a mon pod and curl another mon using its pod hostname, it connects:

sh-4.4# curl -f rook-ceph-mon-b-8cbf8579f-g6nd9:6789
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.

but using the service name doesn't work

sh-4.4# curl -f rook-ceph-mon-a:6789
curl: (56) Recv failure: Connection reset by peer
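
A check like the following can rule out DNS problems or a service with no endpoints behind it (a sketch; the service names are from the listing further below):

kubectl -n rook-ceph get endpoints rook-ceph-mon-a rook-ceph-mon-b rook-ceph-mon-c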

We also noticed potential clues in the rook-ceph-operator logs about the mons not reaching quorum.

2021-02-13 06:11:23.532494 I | op-k8sutil: deployment "rook-ceph-mon-a" did not change, nothing to update
2021-02-13 06:11:23.532658 I | op-mon: waiting for mon quorum with [a c b]
2021-02-13 06:11:24.123965 I | op-mon: mons running: [a c b]
2021-02-13 06:11:44.354283 I | op-mon: mons running: [a c b]
2021-02-13 06:12:04.553052 I | op-mon: mons running: [a c b]
2021-02-13 06:12:24.760423 I | op-mon: mons running: [a c b]
2021-02-13 06:12:44.953344 I | op-mon: mons running: [a c b]
2021-02-13 06:13:05.153151 I | op-mon: mons running: [a c b]
2021-02-13 06:13:25.354678 I | op-mon: mons running: [a c b]
2021-02-13 06:13:45.551489 I | op-mon: mons running: [a c b]
2021-02-13 06:14:05.910343 I | op-mon: mons running: [a c b]
2021-02-13 06:14:26.188100 I | op-mon: mons running: [a c b]
2021-02-13 06:14:46.377549 I | op-mon: mons running: [a c b]
2021-02-13 06:15:06.563272 I | op-mon: mons running: [a c b]
2021-02-13 06:15:27.119178 I | op-mon: mons running: [a c b]
2021-02-13 06:15:47.372562 I | op-mon: mons running: [a c b]
2021-02-13 06:16:07.565653 I | op-mon: mons running: [a c b]
2021-02-13 06:16:27.751456 I | op-mon: mons running: [a c b]
2021-02-13 06:16:47.952091 I | op-mon: mons running: [a c b]
2021-02-13 06:17:08.168884 I | op-mon: mons running: [a c b]
2021-02-13 06:17:28.358448 I | op-mon: mons running: [a c b]
2021-02-13 06:17:48.559239 I | op-mon: mons running: [a c b]
2021-02-13 06:18:08.767715 I | op-mon: mons running: [a c b]
2021-02-13 06:18:28.987579 I | op-mon: mons running: [a c b]
2021-02-13 06:18:49.242784 I | op-mon: mons running: [a c b]
2021-02-13 06:19:09.456809 I | op-mon: mons running: [a c b]
2021-02-13 06:19:29.671632 I | op-mon: mons running: [a c b]
2021-02-13 06:19:49.871453 I | op-mon: mons running: [a c b]
2021-02-13 06:20:10.062897 I | op-mon: mons running: [a c b]
2021-02-13 06:20:30.258163 I | op-mon: mons running: [a c b]
2021-02-13 06:20:50.452097 I | op-mon: mons running: [a c b]
2021-02-13 06:21:10.655282 I | op-mon: mons running: [a c b]
2021-02-13 06:21:25.854570 E | ceph-cluster-controller: failed to reconcile. failed to reconcile cluster "rook-ceph": failed to configure local ceph cluster: failed to create cluster: failed to start ceph monitors: failed to start mon pods: failed to check mon quorum a: failed to wait for mon quorum: exceeded max retry count waiting for monitors to reach quorum

It looks like the mons are no longer reachable through their service endpoints, and that is leaving the whole initialization process stuck.

Below are the services running in the rook-ceph namespace.

NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
csi-cephfsplugin-metrics   ClusterIP   10.233.30.235   <none>        8080/TCP,8081/TCP   83m
csi-rbdplugin-metrics      ClusterIP   10.233.61.8     <none>        8080/TCP,8081/TCP   83m
rook-ceph-mon-a            ClusterIP   10.233.2.224    <none>        6789/TCP,3300/TCP   83m
rook-ceph-mon-b            ClusterIP   10.233.39.129   <none>        6789/TCP,3300/TCP   72m
rook-ceph-mon-c            ClusterIP   10.233.51.59    <none>        6789/TCP,3300/TCP   61m
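
Since the sidecar intercepts all pod traffic, the proxy's view of the mon ports can also be inspected with istioctl (a sketch, assuming istioctl is installed; the pod name is one of the mons from the listing above):

istioctl proxy-status
istioctl proxy-config listeners rook-ceph-mon-a-7966994c76-z9phm.rook-ceph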

Other notes: we are using the latest versions of Istio, rook-ceph, etc. The cluster was created with Kubespray, runs on Ubuntu Bionic with 3 nodes, and uses Calico.

Please let us know if you need more details. Thanks in advance.

I think you could annotate your rook-ceph namespace to not be injected with Istio sidecars. Would that be a good solution for you, or do you require the Istio sidecars on the rook-ceph Pods for some reason? (with some exceptions, as explained here: istio.io/latest/docs/setup/additional-setup/sidecar-injection/…) – AndD
Thanks for the reply... yes, this can be done as a last option. We were trying to include all the pods in rook-ceph to get the benefit of Istio mTLS. – Sunil Kpmbl
You said you use the latest version of rook-ceph; which version, to be precise? I saw some GitHub issues where recent versions have problems with Istio that were fixed in even newer versions, so I would check that as a start. – AndD
Kubernetes v1.19.7, Istio 1.9.0, ceph/ceph:v15.2.8, rook/ceph:v1.5.6, cephcsi:v3.2.0 are the versions we use. I see Rook has released 1.5.7; will try with that as well. Thanks for looking into this. – Sunil Kpmbl
Did upgrading to version 1.5.7 help? – Malgorzata

2 Answers

1 vote

I have narrowed the issue down to the rook-ceph-mon pods. If we exclude the rook-ceph-mon and rook-ceph-osd-prepare pods from sidecar injection (the latter should be fine, since it is a one-time scheduled job), things work fine.

In the Istio configuration I added the following to exclude the mon and osd-prepare pods from sidecar injection, and everything worked after that:

neverInjectSelector:
  - matchExpressions:
      - {key: mon, operator: Exists}
  - matchExpressions:
      - {key: job-name, operator: Exists}
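
For context, neverInjectSelector is part of the sidecar injector configuration. With an IstioOperator-based install it can be set roughly like this (a sketch based on the Istio sidecar-injection docs; adjust it to your install method):

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    sidecarInjectorWebhook:
      neverInjectSelector:
        - matchExpressions:
            - {key: mon, operator: Exists}
        - matchExpressions:
            - {key: job-name, operator: Exists}

The mon label is set by Rook on the mon pods, and job-name is added automatically by the Kubernetes Job controller, which is what catches the osd-prepare pods.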

The other thing I had to do was change the mTLS mode from "STRICT" to "PERMISSIVE".
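
For reference, a minimal sketch of that namespace-level policy (assuming the PeerAuthentication API, applied to the rook-ceph namespace):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: rook-ceph
spec:
  mtls:
    mode: PERMISSIVE

PERMISSIVE is needed here because the mon pods no longer carry a sidecar, so their traffic towards the sidecar-injected pods is plain text and would be rejected under STRICT.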

The pod listing now looks like this (note that there is no sidecar on the mons):

rook-ceph        csi-cephfsplugin-444gk                             3/3     Running     0          16m
rook-ceph        csi-cephfsplugin-9cdkz                             3/3     Running     0          16m
rook-ceph        csi-cephfsplugin-n6k5x                             3/3     Running     0          16m
rook-ceph        csi-cephfsplugin-provisioner-8658f67749-ms985      7/7     Running     2          16m
rook-ceph        csi-cephfsplugin-provisioner-8658f67749-v2g8x      7/7     Running     2          16m
rook-ceph        csi-rbdplugin-lsfhl                                3/3     Running     0          16m
rook-ceph        csi-rbdplugin-mbf67                                3/3     Running     0          16m
rook-ceph        csi-rbdplugin-provisioner-94f699d86-5fvrf          7/7     Running     2          16m
rook-ceph        csi-rbdplugin-provisioner-94f699d86-zl7js          7/7     Running     2          16m
rook-ceph        csi-rbdplugin-swnvt                                3/3     Running     0          16m
rook-ceph        rook-ceph-crashcollector-node1-779c58d4c4-rx7jd    2/2     Running     0          9m20s
rook-ceph        rook-ceph-crashcollector-node2-998bb8769-h4dbx     2/2     Running     0          12m
rook-ceph        rook-ceph-crashcollector-node3-88695c488-gskgb     2/2     Running     0          9m34s
rook-ceph        rook-ceph-mds-myfs-a-6f94b9c496-276tw              2/2     Running     0          9m35s
rook-ceph        rook-ceph-mds-myfs-b-66977b55cb-rqvg9              2/2     Running     0          9m21s
rook-ceph        rook-ceph-mgr-a-7f478d8d67-b4nxv                   2/2     Running     1          12m
rook-ceph        rook-ceph-mon-a-57b6474f8f-65c9z                   1/1     Running     0          16m
rook-ceph        rook-ceph-mon-b-978f77998-9dqdg                    1/1     Running     0          15m
rook-ceph        rook-ceph-mon-c-756fbf5c66-thcjq                   1/1     Running     0          13m
rook-ceph        rook-ceph-operator-5c47844cf-gzms8                 2/2     Running     2          19m
rook-ceph        rook-ceph-osd-0-7d48c6b97d-t725c                   2/2     Running     0          12m
rook-ceph        rook-ceph-osd-1-54797bdd48-zgkrw                   2/2     Running     0          12m
rook-ceph        rook-ceph-osd-2-7898d6cc4-wc2c2                    2/2     Running     0          12m
rook-ceph        rook-ceph-osd-prepare-node1-mczj7                  0/1     Completed   0          12m
rook-ceph        rook-ceph-osd-prepare-node2-tzrk6                  0/1     Completed   0          12m
rook-ceph        rook-ceph-osd-prepare-node3-824lx                  0/1     Completed   0          12m

When the sidecar is enabled on rook-ceph-mon, something strange happens and the mons are not reachable through their service endpoints.

I know this is a bit of a workaround; looking forward to a better answer.

0 votes

When you inject the sidecar, you have to take into account that it takes a few seconds for the istio-proxy to become ready.

In some cases Jobs/CronJobs do not retry anything, so they fail because of network errors while the proxy is still starting; in other cases they run fine but never finish, because they would need to kill the sidecar container, so the Job stays at 1/2 and is never marked Completed.

The same behavior can also affect Deployments and apps that do not implement retries.

Here you can see many examples of how to solve it: https://github.com/istio/istio/issues/11659
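
One of the options discussed there (available since roughly Istio 1.7; verify the exact field for your version) is to hold the application container until the proxy is ready, for example mesh-wide:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      holdApplicationUntilProxyStarts: true

Note that this only fixes the startup race; a Job still needs something like the /quitquitquit call below so the sidecar exits and the Job can complete.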

I use this:

# wait for the istio-proxy sidecar to report ready (readiness endpoint is on port 15021)
until curl -fsI http://localhost:15021/healthz/ready; do sleep 1; done

WHATEVER TASK YOU NEED TO DO

# capture the task's exit code, then ask the sidecar to shut down so the Pod can finish
RC=$?
curl -fsI -X POST http://localhost:15020/quitquitquit
exit $RC
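
To make that concrete, here is roughly how the wrapper ends up inside a Job spec (a hypothetical example; the image name and the echo line are placeholders for whatever the Job really runs):

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: task
          image: curlimages/curl   # placeholder image that happens to ship curl
          command: ["/bin/sh", "-c"]
          args:
            - |
              # wait for the injected istio-proxy to become ready
              until curl -fsI http://localhost:15021/healthz/ready; do sleep 1; done
              echo "doing the real work here"   # placeholder for the actual task
              RC=$?
              # ask the sidecar to exit so the Job can reach Completed
              curl -fsI -X POST http://localhost:15020/quitquitquit
              exit $RC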

In my case I am only having issues with the rook-ceph-osd-prepare-* pods, so I decided to set the annotation that disables sidecar injection on them (see the sketch below). In your case, with the crashcollectors, maybe updating the Istio version could be enough.
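
Concretely, that is the standard Istio per-pod annotation; it has to end up on the pod template metadata, roughly like this (a sketch; with Rook you would propagate it through the annotations section of the CephCluster CR rather than editing the pods directly, so check the Rook docs for the exact component key):

metadata:
  annotations:
    sidecar.istio.io/inject: "false"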

My versions are Kubernetes 1.20, Istio 1.10, and Ceph 15.2.8.