With Lucene, what is the recommended approach for locating matches in search results?
More specifically, suppose the indexed documents have a field "fullText" that stores the plain-text content of a document. Assume that for one of these documents the content is "The quick brown fox jumps over the lazy dog", and that a search is performed for "fox dog". Obviously, the document would be a hit.
In this scenario, can Lucene provide something like the matching regions for the found document? I would like to produce something like:
[{match: "fox", startIndex: 16, length: 3},
{match: "dog", startIndex: 40, length: 3}]
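To make the desired output concrete, here is a naive sketch in plain Java that does not use Lucene at all. It splits the query on whitespace and scans the stored text for each term, ignoring case. The class name NaiveMatchLocator and the Match type are made up for illustration. It matches raw substrings rather than analyzed tokens, so it knows nothing about stemming, stop words, or phrase queries, which is exactly why I would prefer to get these regions from Lucene itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates the desired output only.
public class NaiveMatchLocator {

    /** One matching region: the matched text, its zero-based start offset, and its length. */
    public static final class Match {
        public final String match;
        public final int startIndex;
        public final int length;

        Match(String match, int startIndex, int length) {
            this.match = match;
            this.startIndex = startIndex;
            this.length = length;
        }

        @Override
        public String toString() {
            return "{match: \"" + match + "\", startIndex: " + startIndex
                    + ", length: " + length + "}";
        }
    }

    /**
     * Finds every case-insensitive occurrence of each whitespace-separated
     * query term in the text. Results are grouped by term, in query order.
     * This is substring matching, so "dog" would also match inside "dogma".
     */
    public static List<Match> locate(String text, String query) {
        List<Match> matches = new ArrayList<>();
        for (String term : query.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // an empty query yields a single empty term
            }
            int len = term.length();
            int i = 0;
            while (i + len <= text.length()) {
                if (text.regionMatches(true, i, term, 0, len)) {
                    // Report the text as it appears in the document, not in the query.
                    matches.add(new Match(text.substring(i, i + len), i, len));
                    i += len; // skip past this match to avoid overlapping hits
                } else {
                    i++;
                }
            }
        }
        return matches;
    }
}
```

Calling NaiveMatchLocator.locate(content, "fox dog") on the sentence above returns one entry for "fox" and one for "dog", each with a zero-based character offset into the stored text.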
I suspect this could be implemented with what's provided in the org.apache.lucene.search.highlight package, but I'm not sure about the overall approach.