Let's say I have 5 data nodes. Then I save a Person document. Now a couple of questions:
How can I find which node the document was saved to?
After saving one Person document to a node with two replicas, how can I query for this Person and see which replica/node the result comes from?
How can I check how fast the document becomes available in the two replicas of a node?
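To make the first two questions concrete, here is a minimal Python sketch of what I would expect to work. It assumes the document uses default routing (routing value equal to the document ID), that the `_search_shards` API accepts a `routing` parameter, and that a search sent with `"explain": true` returns `_shard` and `_node` on each hit. The helper names and the index name `people` are hypothetical, and the response shapes should be checked against the cluster's version.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def search_shards_url(base_url, index, doc_id):
    """URL of the _search_shards API, routed the same way as the document ID.

    Assumption: default routing, i.e. the routing value is the document ID.
    """
    query = urlencode({"routing": doc_id})
    return f"{base_url}/{quote(index)}/_search_shards?{query}"


def copies_holding(search_shards_response):
    """Return (shard number, node name, role) for every shard copy listed."""
    nodes = search_shards_response.get("nodes", {})
    copies = []
    for shard_group in search_shards_response.get("shards", []):
        for copy in shard_group:
            # Fall back to the raw node ID if the node name is not listed.
            node_name = nodes.get(copy["node"], {}).get("name", copy["node"])
            role = "primary" if copy["primary"] else "replica"
            copies.append((copy["shard"], node_name, role))
    return copies


def answering_copies(search_response):
    """For a search sent with "explain": true, list (_id, _shard, _node) per hit."""
    hits = search_response["hits"]["hits"]
    return [(h["_id"], h.get("_shard"), h.get("_node")) for h in hits]


def fetch_json(url):
    """Small helper to GET a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)
```

For example, `copies_holding(fetch_json(search_shards_url("http://localhost:9200", "people", "42")))` would list the primary and replica copies of the shard that document `42` routes to, together with the nodes holding them.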
EDIT
The use case is as follows: in general, how can consistency be assured when a primary shard has had new data written to it but that data has not yet been synchronised with a replica, while at the same time the replica is queried for the new data, which at that moment is present only on the primary shard? In particular, I wonder about the DETAILS of consistency in the situation described in the last paragraph of the distributed read documentation ===> On the other hand, here the doc says about the query phase that each primary and replica is queried and builds a priority queue, and that these queues are later merged. So the result from the primary shard would be included in the merged queue, based on the globally sorted result set built at the coordinating node out of all the priority queues.
- Question X: So is a document that exists only on the primary shard returned by a search or not, if it has not yet been replicated to the replicas?
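To illustrate what Question X is asking, here is a toy Python model of the query phase. It is only a sketch of my understanding, not Elasticsearch code: it assumes the coordinating node asks one copy of each shard (either the primary or a replica), each copy returns its own sorted top hits, and the coordinator merges those lists. All names and scores are made up.

```python
import heapq


def shard_top_hits(docs, size):
    """Query phase on one shard copy: the `size` best (score, doc_id) pairs,
    sorted from highest to lowest score."""
    return heapq.nlargest(size, docs)


def coordinator_merge(per_shard_hits, size):
    """Merge the already-sorted per-shard lists into one global top-`size` list."""
    merged = heapq.merge(*per_shard_hits, reverse=True)
    return [hit for _, hit in zip(range(size), merged)]


# Shard 0: the primary already has the new document, the replica does not.
primary_0 = [(0.9, "new"), (0.5, "a")]
replica_0 = [(0.5, "a")]
# Shard 1 is unaffected by the write.
shard_1 = [(0.7, "b")]

# Coordinator happened to pick the primary copy of shard 0.
via_primary = coordinator_merge(
    [shard_top_hits(primary_0, 10), shard_top_hits(shard_1, 10)], 10
)
# Coordinator happened to pick the stale replica copy of shard 0.
via_replica = coordinator_merge(
    [shard_top_hits(replica_0, 10), shard_top_hits(shard_1, 10)], 10
)
```

In this model `via_primary` contains the document `new` and `via_replica` does not, which is exactly the inconsistency I am worried about. Whether the real cluster can ever be in that state is the question.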
In other words.
I want to assure data consistency across my distributed ES cluster. Now I want to test whether the following situation can take place. Let's say I have one cluster with 5 nodes and the data is written to only one node (e.g. node2, with the primary shard). Before the data has had time to replicate to the replicas, I get a query for this new data directed at node3, which in theory should hold a replica of the data but has not received it yet after node2 was changed. So in this case a query sent to node3 requesting the new data would not return it, even though it has been written to node2.
- Question A) If this might happen, how can I control the replication phases/state so that I can tell whether replication is complete?
- Question B) How can I tell whether the replica is consistent with the primary shard or not, and what state it is in (replica's data consistent or inconsistent with the primary shard)?
- Question C) If I can't control this replication flow and data consistency, how can I eliminate potential inconsistencies for a query sent to node3?
- Question D) How can I observe the behaviour of adding a doc to the primary shard and not having it stored on the replica shard (e.g. can I slow down / customize the replication time, or can I test this behaviour some other way)?
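For Questions B and D, the kind of test I have in mind could be sketched in Python as below: pin a search to one node and poll until the document shows up. It assumes the search `preference` parameter accepts `_only_nodes:<node id>` (older versions appear to use a slightly different spelling, so this needs checking), and the function names and the `q=_id:<id>` lookup are my own choices for illustration.

```python
import time
from urllib.parse import quote, urlencode


def pinned_search_url(base_url, index, node_id, doc_id):
    """Search URL restricted to shard copies on a single node.

    Assumption: `preference=_only_nodes:<node id>` is supported by the cluster.
    """
    query = urlencode({
        "preference": f"_only_nodes:{node_id}",
        "q": f"_id:{doc_id}",
    })
    return f"{base_url}/{quote(index)}/_search?{query}"


def seconds_until_visible(is_visible, timeout=5.0, interval=0.05,
                          clock=time.monotonic, sleep=time.sleep):
    """Poll `is_visible()` until it returns True.

    Returns the elapsed seconds, or None if `timeout` passes first.
    `is_visible` is any zero-argument callable, e.g. one that runs the
    pinned search and checks whether the hit count is non-zero.
    """
    start = clock()
    while True:
        if is_visible():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

The idea would be to index the document and then call `seconds_until_visible` with a check against the pinned URL for node3, and separately for node2, and compare the two timings. As far as I understand, the measured time would also include each copy's refresh delay, not only the replication itself, so the numbers would need careful interpretation.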
node1 is not yet at node2 but node2 is being queried for the document? – mCs