In short, I am trying to determine a document's true document ID in method CustomScoreProvider.CustomScore
which only provides a document "ID" relative to a sub-IndexReader.
More info: I am trying to boost my documents' scores by precomputed boost factors (imagine an in-memory structure that maps Lucene's document ids to boost factors). Unfortunately I cannot store the boosts in the index for a couple of reasons: boosting will not be used for all queries, plus the boost factors can change regularly and that would trigger a lot of reindexing.
Instead I'd like to boost the score at query time and thus I've been working with CustomScoreQuery/CustomScoreProvider. The boosting takes place in method CustomScoreProvider.CustomScore:
public override float CustomScore(int doc, float subQueryScore, float valSrcScore) {
float baseScore = subQueryScore * valSrcScore; // the default computation
// boost -- THIS IS WHERE THE PROBLEM IS
float boostedScore = baseScore * MyBoostCache.GetBoostForDocId(doc);
return boostedScore;
}
My problem is with the doc
parameter passed to CustomScore. It is not the true document id -- it is relative to the subreader used for that index segment. (The MyBoostCache
class is my in-memory structure mapping Lucene's doc ids to boost factors.) If I knew the reader's docBase I could figure out the true id (id = doc + docBase
).
Any thoughts on how I can determine the true id, or perhaps there's a better way to accomplish what I'm doing?
(I am aware that the id I'm trying to get is subject to change and I've already taken steps to make sure the MyBoostCache
is always up to date with the latest ids.)