3 votes

We have attempted to shard a large collection in MongoDB 2.4.9 across 3 replica sets (rs1, rs2, rs3). At present, all data resides on rs1.

We have 3 config servers running and enabled sharding using:

sh.enableSharding("test")

We then selected a shard key and sharded a collection:

sh.shardCollection("test.fs.chunks", { files_id : 1 , n : 1 } )   

After that we added our additional shards:

sh.addShard( "rs2/mongo2:27017" )

sh.addShard( "rs3/mongo3:27017" )

However, after 4 days, all data still resides on rs1. Looking at the configuration, the database we are sharding is listed as "partitioned = true":

{  "_id" : "test",  "partitioned" : true,  "primary" : "rs1" }

However, when we execute db.fs.chunks.getShardDistribution() we are presented with an error stating that the collection is not sharded:

mongos> db.fs.chunks.getShardDistribution()
Collection test.fs.chunks is not sharded.

We then tried to re-execute the shardCollection command and receive an error stating that it is already sharded:

mongos> sh.shardCollection("test.fs.chunks", { files_id : 1 , n : 1 } )
{
    "code" : 13449,
    "ok" : 0,
    "errmsg" : "exception: collection test.fs.chunks already sharded with 33463 chunks"
}

All 3 config servers are operational. The mongos logs contain a series of balancer distributed lock acquired / unlocked messages but nothing else noteworthy.
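For reference, the balancer state itself can be confirmed from a mongos shell with the standard helpers (nothing below is specific to our setup):

// run from a mongos shell
sh.getBalancerState()     // true if the balancer is enabled
sh.isBalancerRunning()    // true while a balancing round is in progress
sh.status()               // summarises shards, sharded databases and chunk counts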

Does anyone have any advice on how we can troubleshoot this further and get some sharding happening?

Thanks

Dave

Did you create an index on files_id and n (the shard key) before sharding the collection? Also, you don't mention whether you're using mongos? – Joachim Isaksson
We did have indexes created on "files_id" and "n" prior to sharding the collection (a generic check for this is sketched after these comments). To clarify further: our mongo environment consists of 3 replica sets, 3 mongos instances (our driver only points to one of these at the moment) and 3 config servers. – user3345274
This can happen when something happens during the process of sharding the collection (a network hiccup, some other error) and it leaves the config in a state where the collection is not "fully" sharded. You can check in the config DB: use config; db.collections.find({ns:"test.fs.chunks"}) and db.chunks.find({ns:"test.fs.chunks"}) - my guess is that there are no chunks even though there is an entry in collections. You can clean this up manually, but I would recommend checking the mongos logs from the time the collection was sharded and seeing what errors or warnings you may find. – Asya Kamsky
@AsyaKamsky that sounds likely - there appear to have been issues with the config servers when the collection was initially sharded. This was resolved shortly after, but the collection is now stuck in limbo. db.collections.find({ns:"test.fs.chunks"}) returns no results, db.chunks.find({ns:"test.fs.chunks"}) returns approx. 20 results. Do you have any recommendations on how this could be cleaned up manually? I would prefer that mongo does the splitting rather than having to split the collection manually. Thanks – user3345274
Sorry, the first one should have been db.collections.find({_id:"test.fs.chunks"}) - still nothing back? – Asya Kamsky
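As a note on the index question raised in the comments above, a generic way to verify the shard-key index on this collection (names are simply the ones from the question) is:

mongos> use test
mongos> db.fs.chunks.getIndexes()
// the output should include an entry with "key" : { "files_id" : 1, "n" : 1 }
// if it were missing, it could be created with:
mongos> db.fs.chunks.ensureIndex( { files_id : 1, n : 1 } )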

1 Answer

1 vote

I had a similar problem with a collection but I fixed it using this command:

http://docs.mongodb.org/manual/reference/command/splitChunk/

I'm 100% certain this isn't what you're supposed to do, but it did work!

Actually, another idea would be to create a new collection with only one record in it, shard it, and then insert all the records from the other collection into it.
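A minimal sketch of that idea, assuming the database/collection names from the question and a hypothetical new collection fs.chunks_new (all names and the shard key here are illustrative only):

// on a mongos, in the "test" database
db.fs.chunks_new.ensureIndex( { files_id : 1, n : 1 } )              // shard-key index
sh.shardCollection( "test.fs.chunks_new", { files_id : 1, n : 1 } )  // shard the new collection
db.fs.chunks.find().forEach( function (doc) {                        // copy the existing documents across
    db.fs.chunks_new.insert(doc);
} );
// after verifying the copy, drop the old collection and point the application at the
// new name (renameCollection cannot be used on sharded collections)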

I had a collection with every single record in one chunk. I used sh.status() to find out which chunks were the largest.

Then used:

db.adminCommand( { split: "<database>.<collection>", find: { _id: <any doc in the chunk> } } );

This split the chunk at its midpoint. Interestingly, MongoDB's chunking process then did some further splitting of its own, but it still needed some overriding to get the chunks down to the right sizes.
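For that extra overriding, the shell also has helpers that wrap the same split command; a rough sketch with placeholder values:

sh.splitFind( "<database>.<collection>", { _id : <any doc in the oversized chunk> } )  // split that chunk at its median
sh.splitAt( "<database>.<collection>", { _id : <explicit split point> } )              // split at a specific shard-key value
// the default 64 MB chunk size can also be lowered to encourage more splitting,
// by saving a "chunksize" value (in MB) in the config database:
use config
db.settings.save( { _id : "chunksize", value : 32 } )   // illustrative value only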