I am migrating my code from Lucene 3.5 to Lucene 4.1, but I am having trouble computing a term vector without indexing.
The problem is this: given a text string and an Analyzer, I need to compute the term vector (technically, find the terms and their frequencies, tf). Obviously, this can be achieved by writing an index (using IndexWriter) and then reading the terms back (using IndexReader), but I reckon that would be expensive. Furthermore, I don't need the document frequency (df), so I think an indexing-free solution is suitable.
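To make the desired output concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class name TermFrequencySketch and the method termFrequencies are hypothetical names chosen for illustration, and the lower-case-and-split step is only an assumed stand-in for whatever tokens a real Analyzer would produce. It shows the result I am after (terms mapped to their tf), not a Lucene solution.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: this is NOT a Lucene API.
class TermFrequencySketch {

    // Assumed stand-in for an Analyzer: lower-case the text and split it
    // on runs of characters that are neither letters nor digits.
    static Map<String, Integer> termFrequencies(String text) {
        // LinkedHashMap keeps terms in order of first appearance.
        Map<String, Integer> tf = new LinkedHashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            tf.merge(token, 1, Integer::sum); // increment the count for this term
        }
        return tf;
    }

    public static void main(String[] args) {
        // prints {the=2, quick=1, fox=2, jumps=1, over=1, lazy=1}
        System.out.println(termFrequencies("The quick fox jumps over the lazy fox"));
    }
}
```

What I am looking for is the equivalent of this map, but with the tokens coming from the given Analyzer rather than from a hand-written split.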
In Lucene 2 and 3, a simple technique for this purpose is to use QueryTermVector, which implements TermFreqVector and has a constructor taking a string and an Analyzer. Unfortunately, QueryTermVector (along with TermFreqVector) has been removed in Lucene 4, and the migration documentation does not seem to mention QueryTermVector at all.
Do you have a solution for this problem in Lucene 4? Thank you very much.