6
votes

In my project we use Lucene 2.4.1 for full-text search. This is a J2EE project, and the IndexSearcher is created once. In the background, the index is refreshed every couple of minutes (when the content changes). Users can search the index through a search form on the page.

The problem is, the results returned by Lucene seem to be cached somehow.

This is the scenario I noticed:

  • I start the application and search for 'keyword' - 6 results are returned,
  • The index is refreshed; using Luke, I can see that the query 'keyword' now matches 8 results,
  • I search again through the application, and 6 results are returned again.

I analyzed our configuration and haven't found any caching anywhere. I have debugged the search, and there is no caching in our code; searcher.search returns 6 results.

Does Lucene cache results internally somehow? What properties etc. should I check?


3 Answers

10
votes

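An open IndexReader only sees the index as it was when the reader was opened. To see changes made by IndexWriters since then, call IndexReader.reopen(). Note that reopen() returns a new IndexReader instance when the index has changed (and the same instance otherwise), so you have to create a new IndexSearcher on the returned reader and close the old one.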

Make sure also that your IndexWriter is committing the changes, either through an explicit commit(), a close(), or having autoCommit set to true.
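To illustrate why the long-lived searcher keeps returning 6 results, here is a minimal toy model of the reopen-and-swap pattern. It does not use Lucene: SnapshotReader, ToyIndex and ReaderHolder are hypothetical names made up for this sketch, with SnapshotReader standing in for the role IndexReader plays. In your application, the background refresh job would be the one calling something like refresh().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: none of these types are Lucene classes.
class ReopenSketch {

    /** Hypothetical stand-in for a point-in-time reader. */
    interface SnapshotReader {
        SnapshotReader reopen(); // returns this if the index is unchanged, else a new reader
        int count(String keyword);
        void close();
    }

    /** Hypothetical append-only index; each reader copies the documents committed so far. */
    static class ToyIndex {
        private final List<String> committed = new ArrayList<String>();

        synchronized void addAndCommit(String doc) {
            committed.add(doc);
        }

        synchronized SnapshotReader open() {
            final List<String> frozen = new ArrayList<String>(committed);
            return new SnapshotReader() {
                public SnapshotReader reopen() {
                    synchronized (ToyIndex.this) {
                        if (frozen.size() == committed.size()) return this;
                    }
                    return open();
                }
                public int count(String keyword) {
                    int hits = 0;
                    for (String doc : frozen) {
                        if (doc.contains(keyword)) hits++;
                    }
                    return hits;
                }
                public void close() { /* a real reader would release its files here */ }
            };
        }
    }

    /** Shared by all request threads; the background job calls refresh(). */
    static class ReaderHolder {
        private final AtomicReference<SnapshotReader> current;

        ReaderHolder(SnapshotReader initial) {
            current = new AtomicReference<SnapshotReader>(initial);
        }

        int count(String keyword) {
            return current.get().count(keyword);
        }

        /** Swaps in a fresh reader if the index changed; returns true if it did. */
        boolean refresh() {
            SnapshotReader old = current.get();
            SnapshotReader fresh = old.reopen();
            if (fresh == old) return false;
            current.set(fresh);
            old.close(); // simplified: real code must wait for in-flight searches first
            return true;
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        for (int i = 0; i < 6; i++) index.addAndCommit("keyword doc " + i);
        ReaderHolder holder = new ReaderHolder(index.open());

        index.addAndCommit("keyword doc 6");
        index.addAndCommit("keyword doc 7");
        System.out.println(holder.count("keyword")); // stale reader: prints 6
        holder.refresh();
        System.out.println(holder.count("keyword")); // fresh reader: prints 8
    }
}
```

The point of the holder is that request threads never keep a reader of their own; they always go through the shared reference, so a single refresh makes the new documents visible to everyone. Closing the old reader immediately is a simplification: with real searches in flight you would need to delay the close until they finish.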

1
votes

With versions prior to 2.9.0, Lucene automatically cached the results of queries. With later releases there's no caching unless you wrap your query in a QueryFilter and then wrap the result in a CachingWrapperFilter. You could consider switching to a release >= 2.9.0 if reopening the index becomes a problem.

1
votes

One more note: for an IndexReader to see documents updated by other threads in real time, the "read-only" parameter has to be false when the IndexReader is initialized. Otherwise, the reopen() method will not work.