0
votes

I have a SolrCloud cluster (version 7.4) with 2 nodes (each with a 10GB SSD, 256GB of memory, and a 50GB heap) and 10 collections.

One collection has 12 billion documents and the rest of the collections have 1 billion documents.

We don't know how many shards are appropriate for our use case.

How can I determine the appropriate number of shards, and how many shards are appropriate for each collection?

Is there a formula?

1
No. Especially not with a non-trivial corpus like the one you are outlining: lucidworks.com/2012/07/23/… - Toke Eskildsen

1 Answer

1
votes

Shards should be located on different hardware for optimal performance (that's why you shard). Right now, with 2 nodes, you should pretty much pick 2 shards, one per node. But for the performance you are after, you might need to add more nodes and more shards.
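As a rough illustration of "one shard per node", here is a minimal sketch that builds a Collections API CREATE request with numShards=2. The host, port and collection name (`localhost:8983`, `my_collection`) and the helper name `build_create_collection_url` are assumptions made up for this example, and replicationFactor=1 is just a placeholder value. The sketch only builds the URL; it does not send it.

```python
from urllib.parse import urlencode


def build_create_collection_url(base_url, name, num_shards,
                                replication_factor=1, max_shards_per_node=1):
    """Build a Solr Collections API CREATE URL (hypothetical helper).

    max_shards_per_node=1 asks Solr to place each shard on its own node,
    which matches the "one shard per node" advice above.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed values: a local node and a made-up collection name.
url = build_create_collection_url(
    "http://localhost:8983/solr", "my_collection", num_shards=2
)
print(url)
```

If you add nodes later, you would raise num_shards for new collections accordingly; the right number still has to come from testing with your own data and queries.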

Naturally, performance is not just a function of the number of shards and nodes. It also depends on how much memory each node has (heap and off-heap), CPU, your read/write mix, network speed, disk I/O speed, and so on, not to mention your autoCommit / autoSoftCommit settings relative to the size of the index and your expected load.