Let's say I have Elasticsearch in 3 different environments. We'll call them integration, staging, and production. They all have the same settings (the default 5 shards and 1 replica), with no differences in Elasticsearch configuration, and the same data is indexed in all 3 places. Is it true that if I run the same search against each environment, the returned results would vary across the 3 environments (not widely, but in the relevance scoring) because of how documents are distributed across shards?
2 Answers
The short answer is yes. However, DFS Query then Fetch is how I've resolved this in the past.
DFS Query then Fetch vs Query then Fetch
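Basically, DFS Query then Fetch adds a pre-calculation step that gathers term and document frequencies from all shards before scoring, so scores are based on global statistics rather than per-shard ones and should be more reproducible. Given that these are different environments, though, it may not be worth the extra performance hit in production. Personally, I have found the hit to be nominal even in very large cases.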
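To see why shard distribution matters, here is a minimal Python sketch, not Elasticsearch's actual scoring code. It assumes a classic Lucene-style IDF formula purely for illustration (the real similarity depends on the Elasticsearch version), and the shard statistics and function names are hypothetical. The same 200 documents are split differently across two shards in two environments: the per-shard IDF values differ, while the global IDF that a DFS-style pre-calculation would use is identical.

```python
import math


def idf(doc_count, doc_freq):
    # Classic TF-IDF style inverse document frequency (illustrative assumption).
    return 1.0 + math.log(doc_count / (doc_freq + 1.0))


# Hypothetical per-shard statistics: (documents in shard, documents containing the term).
# Both environments hold the same 200 documents and 40 matches overall,
# but the matches landed on the shards differently.
shards_env_a = [(100, 10), (100, 30)]
shards_env_b = [(100, 20), (100, 20)]


def per_shard_idfs(shards):
    # query_then_fetch: each shard scores with its own local statistics.
    return [idf(n, df) for n, df in shards]


def global_idf(shards):
    # dfs_query_then_fetch: statistics are summed across shards before scoring.
    total_docs = sum(n for n, _ in shards)
    total_df = sum(df for _, df in shards)
    return idf(total_docs, total_df)


if __name__ == "__main__":
    print("per-shard IDF, env A:", per_shard_idfs(shards_env_a))
    print("per-shard IDF, env B:", per_shard_idfs(shards_env_b))
    print("global IDF, env A:", global_idf(shards_env_a))
    print("global IDF, env B:", global_idf(shards_env_b))
```

With local statistics, a document's score depends on which shard it happens to live on; with summed statistics, it does not.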
If you use the search type dfs_query_then_fetch, the absolute relevance score of a document should not vary, but the ordering of documents with equal scores is not guaranteed to be the same.
This difference in ordering can occur even between repeated runs of the same query against a single Elasticsearch instance. It can be alleviated to an extent by using the preference option.
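As a rough sketch of how the two options combine, the Python snippet below builds a search request that sets both `search_type` and `preference` as query-string parameters. The host, index name (`articles`), field name (`title`), session value, and the helper `build_search_request` are all assumptions made for illustration. It only constructs the URL and body; sending it is left to whatever HTTP client you use.

```python
import json
from urllib.parse import urlencode


def build_search_request(base_url, index, query_text, session_id):
    # Hypothetical helper: returns the URL and JSON body for a _search call.
    params = {
        # Gather global term statistics before scoring.
        "search_type": "dfs_query_then_fetch",
        # A custom string routes repeated requests with the same value
        # to the same shard copies, which helps keep tie ordering stable.
        "preference": session_id,
    }
    url = f"{base_url}/{index}/_search?{urlencode(params)}"
    # "title" is an assumed field name.
    body = json.dumps({"query": {"match": {"title": query_text}}})
    return url, body


if __name__ == "__main__":
    url, body = build_search_request(
        "http://localhost:9200", "articles", "shard scoring", "user-42"
    )
    print(url)
    print(body)
```

Using a per-user or per-session string as the preference value keeps ordering consistent for that user without pinning all traffic to one set of shards.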