0 votes

Ceph Cluster PGs inactive/down.

I had a healthy cluster and tried adding a new node using the ceph-deploy tool. I didn't set the noout flag before adding the node to the cluster.
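
(For reference, the noout flag keeps CRUSH from marking OSDs out and triggering rebalancing during maintenance; it is toggled with the standard commands below.)

ceph osd set noout      # before maintenance: OSDs that go down are not marked out
ceph osd unset noout    # after maintenance: return to normal out-marking behaviour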

While using the ceph-deploy tool I ended up deleting the new OSDs a couple of times, and it looks like Ceph tried to rebalance PGs; those PGs are now in an inactive/down state.

I tried recovering one PG just to see if it would recover, but it did not. I am using Ceph to back OpenStack Glance images and VM disks, so all new and existing VMs are now slow or not responding.
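
(The exact recovery attempt isn't shown above; for a single PG the usual checks look like the sketch below. The PG id 2.5 is only a placeholder, and pg repair applies to the PGs flagged inconsistent by scrub, not to down PGs.)

ceph pg 2.5 query       # show peering and recovery state for one PG (placeholder id)
ceph pg repair 2.5      # ask the primary OSD to repair a scrub-inconsistent PG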

Current output of ceph osd tree (note: fre201 is the new node; I have recently disabled the OSD services on that node):

[root@fre201 ceph]# ceph osd tree
ID  CLASS WEIGHT   TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       70.92137 root default
 -2        5.45549     host fre101
  0   hdd  1.81850         osd.0       up  1.00000 1.00000
  1   hdd  1.81850         osd.1       up  1.00000 1.00000
  2   hdd  1.81850         osd.2       up  1.00000 1.00000
 -9        5.45549     host fre103
  3   hdd  1.81850         osd.3       up  1.00000 1.00000
  4   hdd  1.81850         osd.4       up  1.00000 1.00000
  5   hdd  1.81850         osd.5       up  1.00000 1.00000
 -3        5.45549     host fre105
  6   hdd  1.81850         osd.6       up  1.00000 1.00000
  7   hdd  1.81850         osd.7       up  1.00000 1.00000
  8   hdd  1.81850         osd.8       up  1.00000 1.00000
 -4        5.45549     host fre107
  9   hdd  1.81850         osd.9       up  1.00000 1.00000
 10   hdd  1.81850         osd.10      up  1.00000 1.00000
 11   hdd  1.81850         osd.11      up  1.00000 1.00000
 -5        5.45549     host fre109
 12   hdd  1.81850         osd.12      up  1.00000 1.00000
 13   hdd  1.81850         osd.13      up  1.00000 1.00000
 14   hdd  1.81850         osd.14      up  1.00000 1.00000
 -6        5.45549     host fre111
 15   hdd  1.81850         osd.15      up  1.00000 1.00000
 16   hdd  1.81850         osd.16      up  1.00000 1.00000
 17   hdd  1.81850         osd.17      up  0.79999 1.00000
 -7        5.45549     host fre113
 18   hdd  1.81850         osd.18      up  1.00000 1.00000
 19   hdd  1.81850         osd.19      up  1.00000 1.00000
 20   hdd  1.81850         osd.20      up  1.00000 1.00000
 -8        5.45549     host fre115
 21   hdd  1.81850         osd.21      up  1.00000 1.00000
 22   hdd  1.81850         osd.22      up  1.00000 1.00000
 23   hdd  1.81850         osd.23      up  1.00000 1.00000
-10        5.45549     host fre117
 24   hdd  1.81850         osd.24      up  1.00000 1.00000
 25   hdd  1.81850         osd.25      up  1.00000 1.00000
 26   hdd  1.81850         osd.26      up  1.00000 1.00000
-11        5.45549     host fre119
 27   hdd  1.81850         osd.27      up  1.00000 1.00000
 28   hdd  1.81850         osd.28      up  1.00000 1.00000
 29   hdd  1.81850         osd.29      up  1.00000 1.00000
-12        5.45549     host fre121
 30   hdd  1.81850         osd.30      up  1.00000 1.00000
 31   hdd  1.81850         osd.31      up  1.00000 1.00000
 32   hdd  1.81850         osd.32      up  1.00000 1.00000
-13        5.45549     host fre123
 33   hdd  1.81850         osd.33      up  1.00000 1.00000
 34   hdd  1.81850         osd.34      up  1.00000 1.00000
 35   hdd  1.81850         osd.35      up  1.00000 1.00000
-27        5.45549     host fre201
 36   hdd  1.81850         osd.36    down        0 1.00000
 37   hdd  1.81850         osd.37    down        0 1.00000
 38   hdd  1.81850         osd.38    down        0 1.00000
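
(The exact commands used to disable the OSD services on fre201 aren't shown; with the standard systemd unit names this is typically done per OSD id, roughly as follows.)

systemctl stop ceph-osd@36 ceph-osd@37 ceph-osd@38      # stop the running OSD daemons on fre201
systemctl disable ceph-osd@36 ceph-osd@37 ceph-osd@38   # keep them from starting on the next boot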

Current Ceph health:

ceph -s
  cluster:
    id:     XXXXXXXXXXXXXXXX
    health: HEALTH_ERR
            3 pools have many more objects per pg than average
            358887/12390692 objects misplaced (2.896%)
            2 scrub errors
            9677 PGs pending on creation
            Reduced data availability: 7125 pgs inactive, 6185 pgs down, 2 pgs peering, 2709 pgs stale
            Possible data damage: 2 pgs inconsistent
            Degraded data redundancy: 193505/12390692 objects degraded (1.562%), 351 pgs degraded, 1303 pgs undersized
            53882 slow requests are blocked > 32 sec
            4082 stuck requests are blocked > 4096 sec
            too many PGs per OSD (2969 > max 200)

  services:
    mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03
    mgr: ceph-mon03(active), standbys: ceph-mon01, ceph-mon02
    osd: 39 osds: 36 up, 36 in; 51 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   18 pools, 54656 pgs
    objects: 6050k objects, 10940 GB
    usage:   21721 GB used, 45314 GB / 67035 GB avail
    pgs:     13.036% pgs not active
             193505/12390692 objects degraded (1.562%)
             358887/12390692 objects misplaced (2.896%)
             46177 active+clean
             5070  down
             1114  stale+down
             1088  stale+active+undersized
             547   activating
             201   stale+active+undersized+degraded
             173   stale+activating
             96    activating+degraded
             61    stale+active+clean
             43    activating+remapped
             39    stale+activating+degraded
             24    stale+activating+remapped
             9     activating+undersized+degraded+remapped
             4     stale+activating+undersized+degraded+remapped
             2     active+clean+inconsistent
             1     stale+activating+degraded+remapped
             1     stale+remapped+peering
             1     active+undersized
             1     stale+peering
             1     stale+active+clean+remapped
             1     down+remapped
             1     stale+remapped
             1     activating+degraded+remapped

  io:
    client:   967 kB/s rd, 1225 kB/s wr, 29 op/s rd, 30 op/s wr

I am not sure how to recover the 7125 inactive PGs, which are mapped to OSDs that are still up. Any help would be appreciated.
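
(For anyone triaging a similar state, these read-only commands are the usual starting point for finding out which PGs are affected and why.)

ceph health detail            # lists the affected PG ids behind each health warning
ceph pg dump_stuck inactive   # PGs that cannot serve reads/writes
ceph pg dump_stuck stale      # PGs whose primary OSD has not reported recently
ceph osd pool ls detail       # per-pool pg_num, replica size and flags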

IOPS are also almost 0; that's probably why it's not recovering. – Arun POONIA

1 Answer

0 votes

In the Luminous release, Ceph enforces a maximum of 200 PGs per OSD by default. In my case there were more than 3000 PGs per OSD, so I had to raise the PG-per-OSD limit (mon_max_pg_per_osd) to 5000 in the /etc/ceph/ceph.conf of the monitors and OSDs, which allowed the cluster to recover.
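
A minimal sketch of the change described above, assuming mon_max_pg_per_osd is the option in question; the daemons need a restart to pick up the new value:

# /etc/ceph/ceph.conf (on the monitor and OSD hosts)
[global]
mon_max_pg_per_osd = 5000

# then restart the daemons so the higher limit takes effect
# systemctl restart ceph-mon.target    (on each monitor)
# systemctl restart ceph-osd.target    (on each OSD host)

Raising the limit this far is a workaround to get the pending/inactive PGs to activate; the underlying problem of far too many PGs per OSD (the "too many PGs per OSD (2969 > max 200)" warning) still needs to be addressed separately.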