I have an Elasticsearch cluster with 11 nodes. Five of these are data nodes and the others are client nodes through which I index and retrieve documents.
I am using the default Elasticsearch configuration. Each index has 5 primary shards plus replicas. The cluster holds 55 indices and roughly 150 GB of data.
The cluster is very slow. With the Kopf plugin I can see the stats of each node, and they show that a single data node (not the master) is permanently overloaded. Heap, disk, and CPU usage are fine, but the load is at 100% almost all the time. I have also noticed that every shard on this node is a primary shard, whereas all other data nodes hold both primary shards and replicas. When I shut that node down and start it again, the same problem appears on another data node.
I don't know why this happens or how to solve it. I thought the client nodes and the master node distribute requests evenly. Why is one data node always overloaded?
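In case it helps to reproduce the observation about primaries and replicas, here is a minimal sketch of how the split per node could be tallied outside of Kopf. It assumes the plain-text output of `GET _cat/shards?h=index,shard,prirep,state,node` has been saved to a string; the function name, the index names, and the node names in the sample are made up for illustration and are not from my cluster.

```python
from collections import defaultdict

# Hypothetical sample of `_cat/shards?h=index,shard,prirep,state,node` output.
# Index and node names are invented for illustration only.
SAMPLE = """\
logs 0 p STARTED data-1
logs 0 r STARTED data-2
logs 1 p STARTED data-1
logs 1 r UNASSIGNED
"""


def count_shards_per_node(cat_shards_text):
    """Return {node: {"p": primaries, "r": replicas}} for started shards."""
    counts = defaultdict(lambda: {"p": 0, "r": 0})
    for line in cat_shards_text.splitlines():
        # Split into at most 5 fields so a node name containing spaces survives.
        fields = line.split(None, 4)
        if len(fields) < 5:
            # Unassigned shards have no node column; skip them.
            continue
        _index, _shard, prirep, state, node = fields
        if state != "STARTED" or prirep not in ("p", "r"):
            continue
        counts[node.strip()][prirep] += 1
    return dict(counts)


if __name__ == "__main__":
    for node, c in sorted(count_shards_per_node(SAMPLE).items()):
        print(node, "primaries:", c["p"], "replicas:", c["r"])
```

A node that shows only primaries and no replicas in this tally would match what I see on the overloaded node.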