I've got a 3-shard cluster consisting of the following shards:
- bp-rs0
- bp-rs1
- bp-rs3
I want to remove one shard: bp-rs3.
I executed db.adminCommand( { removeShard: "bp-rs3" } )
and got back what I would expect, the typical acknowledgment.
It said I needed to drop or movePrimary one database. Since I no longer needed that database, I dropped it. I'm not sure whether that is what caused my problem, which is:
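To be precise, all I did to satisfy that was drop the unwanted database from a mongos session; the database name below is a placeholder for the one removeShard listed, but the command was essentially:

    // Run from mongos; "oldDb" stands in for the database removeShard asked me to drop or move.
    db.getSiblingDB("oldDb").dropDatabase()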
For a few hours now, the draining message returned by running db.adminCommand( { removeShard: "bp-rs3" } )
has said exactly the following:
{
  "msg" : "draining ongoing",
  "state" : "ongoing",
  "remaining" : {
    "chunks" : 334,
    "dbs" : 0
  },
  "note" : "you need to drop or movePrimary these databases",
  "dbsToMove" : [ ],
  "ok" : 1,
  "operationTime" : Timestamp(1629235413, 2),
  "$clusterTime" : {
    "clusterTime" : Timestamp(1629235413, 2),
    "signature" : {
      "hash" : BinData(0,"IkfHFSkxh7gQheeWlXsI/tTjU1U="),
      "keyId" : 6978594490403520515
    }
  }
}
Note the 334 remaining chunks; that count hasn't changed for a long time.
This wouldn't be too much of an issue, but my most used collection is now un-queryable, which means the app it serves is unusable.
I get the following error when trying to query my only partitioned collection:
{
  "message" : "Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: 'primary' } for set bp-rs1",
  "ok" : 0,
  "code" : 133,
  "codeName" : "FailedToSatisfyReadPreference",
  "operationTime" : "Timestamp(1629232940, 1)",
  "$clusterTime" : {
    "clusterTime" : "Timestamp(1629232944, 2)",
    "signature" : {
      "hash" : "IlYQ/HU+EWYsm8CL2xtCziX6xtY=",
      "keyId" : "6978594490403520515"
    }
  },
  "name" : "MongoError"
}
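For context, the failing read isn't anything exotic; even a basic find on the sharded collection through mongos, something like the below (the filter value is just a placeholder), comes back with that error:

    // Any simple query against the sharded collection via mongos fails with the error above.
    // The filter value here is a placeholder.
    db.getSiblingDB("xxx").listings.find({ "meta.canonical" : "some-value" }).limit(1)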
I don't know why bp-rs1 would be affected at all; bp-rs0 is the primary shard for the database.
sh.status() returns the following:
--- Sharding Status ---
  sharding version: {
    "_id" : NumberInt(1),
    "minCompatibleVersion" : NumberInt(5),
    "currentVersion" : NumberInt(6),
    "clusterId" : ObjectId("602d2def7771e35f1961e454")
  }
  shards:
    { "_id" : "bp-rs0", "host" : "bp-rs0/xxx:27020,xxx:27020", "state" : NumberInt(1) }
    { "_id" : "bp-rs1", "host" : "bp-rs1/xxx:27020", "state" : NumberInt(1) }
    { "_id" : "bp-rs3", "host" : "bp-rs3/xxx:27020", "state" : NumberInt(1), "draining" : true }
  active mongoses:
    "4.0.3" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: yes
    Failed balancer rounds in last 5 attempts: 5
    Last reported error: Could not find host matching read preference { mode: "primary" } for set bp-rs1
    Time of Reported error: Tue Aug 17 2021 23:09:45 GMT+0100 (British Summer Time)
    Migration Results for the last 24 hours:
      241 : Success
      1 : Failed with error 'aborted', from bp-rs3 to bp-rs1
  databases:
    { "_id" : "xxx", "primary" : "bp-rs0", "partitioned" : true, "version" : { "uuid" : UUID("c6301dba-1f34-4043-be6f-1e99dc9a8fb9"), "lastMod" : NumberInt(1) } }
      xxx.listings
        shard key: { "meta.canonical" : 1 }
        unique: false
        balancing: true
        chunks:
          bp-rs0  696
          bp-rs1  695
          bp-rs3  334
        too many chunks to print, use verbose if you want to force print
    { "_id" : "config", "primary" : "config", "partitioned" : true }
      config.system.sessions
        shard key: { "_id" : NumberInt(1) }
        unique: false
        balancing: true
        chunks:
          bp-rs0  1
        { "_id" : MinKey } -->> { "_id" : MaxKey } on : bp-rs0 Timestamp(1, 0)
Is there something I can do, either to roll back and start again, or just to make everything work as it should?
Thanks in advance