2
votes

We have java application with embedded Elasticsearch in a cluster of 14 nodes. All the data resides in a central database, and they are indexed in elasticsearch for querying. A full reindex can be done at any time.

The system are very query-heavy, the amount of writes are small. The number of documents will not be higher than, say, 300.000. The size of each document varies greatly, from just a couple of ids, to extracted text from e.g word-documents of several pages.

I want to make sure that in case of a total breakdown, it should be sufficient that one or two nodes are available for the system to work.

Write consistency should not be a problem since the master copy of the data is in the database, and it seems that ES is capable of resolving conflicting data by using the newest version (which should be all right in our case)

My first though is to use 1 shard, and 13 replicas. This will naturally ensure that all nodes have access to all data. This could also be accomplished by having 2 shards / 13 replicas, so this yield that to ensure that all data is available, the number of replicas should be the number of nodes - 1, not depending on the number of shards (which could be anything).

If the requirement of number of nodes are reduced to "2 nodes should be up at any time", then a shards / replica distribution of "x/number of nodes - 2" should be sufficient.

So, for the question:

Asserting the above setup and that my thoughts is correct, would a setup with 1 shard / 13 replicas make sense or would there be anything to gain by adding more shards and run e.g a 4 shards/13 replicas setup?

2

2 Answers

0
votes

After a good bit of research and talking to ES-gurus;

As long as the shard size is small enough, the most efficient way of setting up this cluster would indeed be 1 shard only, with 13 replicas. I have not been able to pinpoint the threshold size of the shard for this starting to perform worse.

0
votes

If the index is big... you will need more than one shard (if you want perfomance). Do You really need 13 replica? When you put only 2 replicas, ES manage that to keep it that way, if the principal node fail, ES will create a new reply. May be you will need a balancer node too.