
To scale writes, I decided to convert a standalone MongoDB instance to a sharded cluster. The issue is that I can't find a shard key with enough randomness to distribute the cluster's write operations evenly, and I can't use a hashed shard key because every collection has a compound unique index.

I see 2 solutions:

  1. Use a prefix of the unique index as the shard key. Even ignoring the poor distribution, this is not acceptable, because all chunks are initially placed on the primary shard and only afterwards does the balancer distribute them across shards. When I use a hashed shard key, negative hash values are placed on the 1st shard and positive ones on the 2nd. How can I force MongoDB to distribute chunks across both shards up front when using range distribution?

  2. Use tag-aware sharding. But I can't predict the data for the coming months, so my tags could end up unevenly distributed in the future. I suppose there is no cheap way to automate data tagging.
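
To make the constraint behind option 1 concrete, here is a minimal sketch in plain Python (no MongoDB driver involved). It assumes the rule that, in a range-sharded collection, a unique index must have the shard key as its prefix, and it lists the field combinations that would qualify as shard keys for a given set of unique indexes. The helper names and the field names `customer_id`, `hour` and `metric` are hypothetical, and the sketch ignores the default `_id` index and index sort direction.

```python
from typing import List, Sequence, Tuple


def is_prefix_of(shard_key: Sequence[str], unique_index: Sequence[str]) -> bool:
    """Return True if shard_key equals the leading fields of unique_index, in order."""
    n = len(shard_key)
    return 0 < n <= len(unique_index) and list(unique_index[:n]) == list(shard_key)


def candidate_shard_keys(unique_indexes: List[Sequence[str]]) -> List[Tuple[str, ...]]:
    """Return every field tuple that is a prefix of all given unique indexes."""
    if not unique_indexes:
        return []
    first = unique_indexes[0]
    candidates = []
    # Any valid key must be a prefix of the first index, so only those are tried.
    for n in range(1, len(first) + 1):
        prefix = tuple(first[:n])
        if all(is_prefix_of(prefix, idx) for idx in unique_indexes):
            candidates.append(prefix)
    return candidates


if __name__ == "__main__":
    # Hypothetical compound unique index; the real field names may differ.
    indexes = [("customer_id", "hour", "metric")]
    for key in candidate_shard_keys(indexes):
        print(key)
```

Each candidate still has to be judged separately for cardinality and write distribution; the sketch only checks the uniqueness constraint.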

Do you know of any sharding solutions for collections with a compound unique key?

Is temporal locality important to you? If so, use the cheap option and go with a timestamp. - wulfgarpro
I'm not sure I follow. Do you mean using the timestamp column as a tag? Unfortunately our "timestamp" is the true timestamp truncated to the hour. Would that lead to a "hot shard"? - versificator
This is a good write-up. - wulfgarpro

1 Answer