tl;dr:
What's the best way to bulk-fetch documents from Lucene using an exact-match on a set of keys?
Long version:
We have a Lucene index persisted to disk that is read through a DirectoryReader.
It contains 2,000,000 documents with the schema:
{"key": "20-character-string", "value": "1-1000-character-string"}
We now need to perform the equivalent of a SELECT document WHERE document.key IN $keyArray -- i.e. return the documents whose keys appear in $keyArray (a 10,000-item array of keys), using an exact match.
Is there a better way than performing 10,000 separate searches?
TermInSetQuery is what I'm after. - Lawrence Wagerfield
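For anyone landing here, a minimal sketch of how TermInSetQuery might be used for this kind of bulk lookup. It rests on several assumptions not stated in the question: the "key" field was indexed un-analyzed (e.g. as a StringField) so that exact matching works, "value" is a stored field, keys are unique per document, and the Lucene version still offers IndexSearcher.doc(int) (true through 9.x; newer releases read stored fields via searcher.storedFields() instead). The class and method names (BulkKeyLookup, fetchValues) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermInSetQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper: one query for the whole key set instead of 10,000 searches.
public final class BulkKeyLookup {

    // Returns the stored "value" of every document whose "key" is in `keys`.
    static List<String> fetchValues(Path indexPath, Collection<String> keys) throws IOException {
        // IndexSearcher.search rejects a hit count of 0, so handle the empty case up front.
        if (keys.isEmpty()) {
            return List.of();
        }
        try (FSDirectory dir = FSDirectory.open(indexPath);
             DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);

            // TermInSetQuery takes the terms as raw bytes (UTF-8 encoded here).
            List<BytesRef> terms = new ArrayList<>(keys.size());
            for (String key : keys) {
                terms.add(new BytesRef(key));
            }

            // Matches any document whose "key" field equals one of the terms.
            TermInSetQuery query = new TermInSetQuery("key", terms);

            // Assumption: keys are unique, so there are at most keys.size() hits.
            TopDocs hits = searcher.search(query, keys.size());

            List<String> values = new ArrayList<>(hits.scoreDocs.length);
            for (ScoreDoc hit : hits.scoreDocs) {
                // Assumption: Lucene <= 9.x API; later versions use searcher.storedFields().
                Document doc = searcher.doc(hit.doc);
                values.add(doc.get("value"));
            }
            return values;
        }
    }
}
```

Unlike a BooleanQuery with 10,000 SHOULD clauses, TermInSetQuery is not subject to the default maximum clause count and scores every match the same, which is fine here since only membership matters. If keys can repeat across documents, the hit limit passed to search would need to be larger than keys.size().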