
I am in the process of creating an Elasticsearch cluster where the major emphasis will be on search. The initial throughput for search is expected to be 6,000 requests per second. Currently I have the following configuration:

  • Master Nodes - 7 nodes (8 CPU, 16 GB RAM)
  • Data Nodes - 12 nodes (16 CPU, 32 GB RAM)
  • Coordinating Nodes - 4 nodes (16 CPU, 32 GB RAM)

With this setup, when I run a load test with JMeter using a sample query from our testing, the maximum I am able to reach is 58 requests per second, with the target being an average response time of no more than 1.5 seconds. The query itself is a multi_match query wrapped in a function_score for custom scoring. I am searching across almost 20 indices; the data in each index is not large, with document counts only in the hundreds. For this reason every index has 1 shard and 2 replicas. Any thoughts/help on how I can increase the throughput is most welcome.
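For reference, the query being tested is roughly of the following shape, shown here with the elasticsearch-py client (8.x-style API); the index pattern, field names and scoring function are placeholders rather than the real ones:

```python
# Rough shape of the query under test; all names below are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="search-*",                     # fans out over the ~20 small indices
    query={
        "function_score": {
            "query": {
                "multi_match": {
                    "query": "sample search text",
                    "fields": ["title^3", "description"],
                }
            },
            "functions": [
                # custom scoring, e.g. boosting by a numeric popularity field
                {"field_value_factor": {"field": "popularity", "missing": 1}}
            ],
            "boost_mode": "multiply",
        }
    },
    size=10,
)
print(resp["hits"]["total"])
```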

Thanks


1 Answer


That's a complex question with many different things you can tweak, but here are some ideas for starters:

  1. Why so many master nodes? 7 master-eligible nodes mean you can survive the loss of up to 3 of them; is that a scenario you actually need to cover? 3 dedicated masters are enough for most clusters, so it might make more sense to put that hardware into additional data nodes instead.
  2. Do you need the coordinating nodes? Again, it might make more sense to put that capacity into the data nodes instead.
  3. "I have almost 20 indexes on which I am performing the search, the data is not that huge in each indexes, the count will be in hundreds only." How much data do you have in each shard? A shard should be in the GB range. You might benefit from fewer shards (potentially have a field you can filter on to keep some distinction) and increase the number of replicas.