Shards / Replicas settings for high availability

Question

We have a Java application with embedded Elasticsearch in a cluster of 14 nodes. All the data resides in a central database and is indexed in Elasticsearch for querying. A full reindex can be done at any time.

The system is very query-heavy, and the volume of writes is small. The number of documents will not exceed, say, 300,000. The size of each document varies greatly, from just a couple of IDs to text extracted from, e.g., Word documents of several pages.

I want to make sure that, even in a near-total breakdown, the system keeps working as long as one or two nodes are available.

Write consistency should not be a problem, since the master copy of the data is in the database, and it seems that ES is capable of resolving conflicting data by using the newest version (which should be all right in our case).

My first thought is to use 1 shard and 13 replicas. This naturally ensures that every node holds all the data. The same could be accomplished with 2 shards / 13 replicas, which suggests that to keep all data available on every node, the number of replicas should be the number of nodes - 1, regardless of the number of shards (which could be anything).

If the requirement is relaxed to "2 nodes should be up at any time", then a shards/replicas distribution of "x / (number of nodes - 2)" should be sufficient.
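The replica arithmetic above can be written out as a small sketch. The function name `replicas_needed` is made up for illustration and is not part of any Elasticsearch API. It relies on the fact that Elasticsearch places each copy of a shard (the primary plus its replicas) on a different node, so any set of surviving nodes is guaranteed to hold a copy once the number of copies exceeds the number of nodes that may fail.

```python
def replicas_needed(total_nodes: int, min_surviving_nodes: int) -> int:
    """Replicas per shard so that any `min_surviving_nodes` nodes hold every shard.

    Hypothetical helper for illustration only.
    """
    if not 1 <= min_surviving_nodes <= total_nodes:
        raise ValueError("min_surviving_nodes must be between 1 and total_nodes")
    # Up to (total_nodes - min_surviving_nodes) nodes may fail, so each shard
    # needs one more copy than that number.
    copies_needed = total_nodes - min_surviving_nodes + 1
    # One of those copies is the primary; the rest are replicas.
    return copies_needed - 1


print(replicas_needed(14, 1))  # 13
print(replicas_needed(14, 2))  # 12
```

The result does not depend on the number of shards, which matches the reasoning above: the replica count applies to each shard separately.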

So, for the question:

Assuming the above setup and that my reasoning is correct, would a setup with 1 shard / 13 replicas make sense, or would there be anything to gain by adding more shards and running, e.g., a 4 shards / 13 replicas setup?

runarM · Accepted Answer · 2013-10-29T11:37:51

After a good bit of research and talking to ES gurus:

As long as the shard is small enough, the most efficient way of setting up this cluster would indeed be 1 shard only, with 13 replicas. I have not been able to pinpoint the shard size at which this setup starts to perform worse.
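For reference, here is a minimal sketch of what such a setup looks like as index settings, and of how many shard copies each layout from the question produces. `index.number_of_shards` and `index.number_of_replicas` are the standard Elasticsearch index settings; the helper names `index_settings` and `total_shard_copies` are invented for illustration, and the per-node figure assumes all 14 nodes are up and every copy is allocated.

```python
import json


def index_settings(shards: int, replicas: int) -> dict:
    """Build an index-creation body (hypothetical helper, illustration only)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(shards: int, replicas: int) -> int:
    """Each shard exists once as a primary plus once per replica."""
    return shards * (replicas + 1)


NODES = 14  # cluster size from the question

for shards in (1, 4):
    copies = total_shard_copies(shards, 13)
    # With 13 replicas on 14 nodes, every node holds one copy of every shard.
    print(f"{shards} shard(s) / 13 replicas: {copies} copies, {copies // NODES} per node")

print(json.dumps(index_settings(1, 13), indent=2))
```

Either layout puts the full data set on every node. As far as I understand it, the 4-shard variant only splits that data into four separate shards per node, which each query then has to be spread across.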
