Here's the problem I'm trying to solve:
- I have multiple Lucene indices, each containing a subset of the same data structure (they share the same fields, but any given field may or may not be present in a document in a particular index)
- There is a global identifier that is shared between indices. Meaning, if there are 4 indices, there may be up to 4 documents sharing a single key.
- I have a single Lucene query
I query all indices together using a MultiReader, and I am able to find out which sub-index each hit comes from using ReaderUtil. So far so good, but here's the problem:
To perform a (rather complex) merge, I need the documents from all sub-indices with any key that matched at least one document in the original query.
Here's an example:
Index 1
1: {key: "foo", name: "Name A", something: 42}
2: {key: "bar", something: 2}
Index 2
27: {key: "foo", something: 2}
Index 3
102: {key: "foo", name: "Name B"}
103: {key: "bar", something: 999}
Now, if I perform a query for name "Name A", I only get document 1 from index 1.
What I actually need are all documents from all indices whose keys were hit by that query, which here means all documents with key foo:
- doc 1 from index 1
- doc 27 from index 2
- doc 102 from index 3
based on the original query for name: "Name A".
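To make the desired result concrete, here is a minimal sketch in plain Java of the straightforward two-pass approach: first collect the keys of the hits, then fetch every document carrying one of those keys. It deliberately does not use Lucene at all. The class `KeyExpansionSketch`, the `Doc` type and the methods `matchingKeys` and `expandByKey` are hypothetical names made up for illustration, and the "query" is reduced to an exact match on a single field.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model of the example above; no Lucene involved.
// All names here are hypothetical and exist only for illustration.
class KeyExpansionSketch {

    // One stored document: the index it lives in, its doc id, and its fields.
    static final class Doc {
        final int index;
        final int docId;
        final Map<String, Object> fields;

        Doc(int index, int docId, Map<String, Object> fields) {
            this.index = index;
            this.docId = docId;
            this.fields = fields;
        }
    }

    // The example data from the three indices.
    static List<Doc> exampleDocs() {
        return List.of(
            new Doc(1, 1, Map.of("key", "foo", "name", "Name A", "something", 42)),
            new Doc(1, 2, Map.of("key", "bar", "something", 2)),
            new Doc(2, 27, Map.of("key", "foo", "something", 2)),
            new Doc(3, 102, Map.of("key", "foo", "name", "Name B")),
            new Doc(3, 103, Map.of("key", "bar", "something", 999)));
    }

    // Pass 1: run the original query (simplified to an exact match on one
    // field) and collect the keys of all matching documents.
    static Set<String> matchingKeys(List<Doc> docs, String field, Object value) {
        Set<String> keys = new LinkedHashSet<>();
        for (Doc d : docs) {
            if (value.equals(d.fields.get(field))) {
                keys.add((String) d.fields.get("key"));
            }
        }
        return keys;
    }

    // Pass 2: fetch every document, from any index, whose key was collected
    // in pass 1. This is the step that corresponds to the "massive OR".
    static List<Doc> expandByKey(List<Doc> docs, Set<String> keys) {
        List<Doc> result = new ArrayList<>();
        for (Doc d : docs) {
            if (keys.contains(d.fields.get("key"))) {
                result.add(d);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = exampleDocs();
        Set<String> keys = matchingKeys(docs, "name", "Name A");
        for (Doc d : expandByKey(docs, keys)) {
            System.out.println("doc " + d.docId + " from index " + d.index);
        }
    }
}
```

Running `main` on the example data lists doc 1 from index 1, doc 27 from index 2 and doc 102 from index 3, which is exactly the set I am after.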
Can I achieve this without 2 separate queries, the second being a massive OR based on the keys retrieved in the first? Is there a more efficient way?