3
votes

As I understand from the documentation, when a node goes down, elastic search automatically creates primary or replica shards on all other nodes to address the failure of the node.

However, what happens if the nodes comes up - will those shards are "automatically" removed created on other node as the machine comes up - say , in a cluster with 3 nodes , node 1 which contains 1 primary shard of index 1 , and one replica shard of index 2 . If node 1 goes down, the ES automatically creates the primary shard , replica shard in either of the available nodes

1

1 Answers

6
votes

It depends on the available nodes in the cluster and index configuration(shards and replicas).

Few important things to note

  1. Elasticsearch never allocates the primary shards and its replica on the same node.
  2. When a node containing primary shards goes down, and if there is another node containing its replica, then Elasticsearch just promotes that replica shard to primary shard(instant) and then see if there is another node in the cluster where it can copy the replica shard using fsync (copying the shards data) over the network.

This can cause below failure in the cluster:

  1. If Elasticsearch finds available nodes to create a replica it will create otherwise your cluster health will be turned Yellow(missing replica shards).

  2. If Elasticsearch can't promote any replica shard and a primary shard of an index can't be allocated, cluster health will be turned RED(missing primary shard).

When a node again joins the cluster

  1. Elasticsearch again tries to rebalance and recover and based on that cluster state and health will be updated.

Coming to your Example:- if you have just 1 primary shard and no replica of the index in your example, cluster state will be RED but when the node again joins the cluster it will again become Green.

But if you have 1 primary shard and 1 replica configuration, Elasticsearch will simply promote the replica shard on(node 2 or node 3) for index 1 and for index 2 whose replica shard was present on node 1, will be created on another node using its primary shard and cluster health will be in GREEN only and when node again joins, those shards will simply not be used.