I have an Elasticsearch (v5.6.10) cluster with 3 nodes.
- Node A : Master
- Node B : Master + Data
- Node C : Master + Data
There are 6 shards per data node with replication set as 1. All 6 primary nodes are in Node B and all 6 replicas are in Node C.
My requirement is to take out the Node B, do some maintenance work and put it back in cluster without any downtime.
I checked elastic documentation, discussion forum and stackoverflow questions. I found that I should execute the below request first in order to to allocate the shards on that node to the remaining nodes.
curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
"transient" :{
"cluster.routing.allocation.exclude._ip" : <Node B IP>
}
}';echo
Once all the shards have been reallocated I can shutdown the node and do my maintenance work. Once I am done, I have to include the node again for allocation and Elasticsearch will rebalance the shards again.
Now I also found another discussion where the user faced an issue of yellow/red cluster health issue due to having only one data node but wrongly setting replication as one, causing unassigned shards. It seems to me while doing this exercise I am taking my cluster towards that state.
So, my concern here is whether I am following the correct way keeping in mind all my primary shards are in the node (Node B) which I am taking out of a cluster having replication factor as 1.