We are using Lucene 2.9.2 (an upgrade to 3.x is planned), and we know from experience that our search queries become slower over time. Usually we perform a full reindex. I have read the question https://stackoverflow.com/a/668453/356815 and its answers, so to answer that up front: we do NOT use optimize(), because its runtime was no longer acceptable.
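For context, this is roughly what that abandoned optimize() run looked like (a minimal sketch; the index path and analyzer are placeholders, not our real setup):

```java
import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Sketch of the optimize() call we no longer run because it took too long.
public class OptimizeIndex {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("/path/to/index")),   // placeholder path
                new StandardAnalyzer(Version.LUCENE_29),
                false,                                          // open existing index, do not re-create
                IndexWriter.MaxFieldLength.UNLIMITED);
        try {
            writer.optimize(); // merges all segments into one; very I/O-heavy on a 1 GB index
        } finally {
            writer.close();
        }
    }
}
```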
Fragmentation?
What I wonder is: what are the best practices for measuring the fragmentation of an existing index? Can Luke help me with that?
It would be very interesting to hear your thoughts about this analysis topic.
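For reference, here is roughly what I can already read off the index programmatically with the 2.9 API. The path is a placeholder, and I am assuming that the number of segments and the ratio of deleted documents are the kind of fragmentation indicators worth watching (Luke seems to show similar figures in its Overview tab):

```java
import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.FSDirectory;

// Prints simple "fragmentation" indicators for an existing index.
public class IndexStats {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open(
                FSDirectory.open(new File("/path/to/index")), true); // read-only
        try {
            int maxDoc = reader.maxDoc();    // live + deleted documents
            int numDocs = reader.numDocs();  // live documents only
            int deleted = maxDoc - numDocs;  // documents flagged in the .del files
            int segments = reader.getSequentialSubReaders().length;
            System.out.println("segments:     " + segments);
            System.out.println("deleted docs: " + deleted
                    + " (" + (100.0 * deleted / maxDoc) + "% of maxDoc)");
        } finally {
            reader.close();
        }
    }
}
```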
A bit more info about our index:
- We have indexed 400,000 documents
- We make heavy use of properties per document
- For each request we create a new searcher object, because we want changes to appear in the search results immediately (see the sketch after this list)
- Query times range from 30 ms (repeated identical searches) to 10 seconds (complex queries)
- The index consists of 44 files (15 .del files, 24 .cfs files) and is about 1 GB in size
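To illustrate the searcher-per-request point from the list above, here is a simplified sketch of that pattern; the real code obtains the Directory and the Query elsewhere, and the hit count of 50 is arbitrary:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;

// One fresh reader + searcher per request, so new documents and deletions
// are visible immediately in the results.
public class PerRequestSearch {
    public static TopDocs search(Directory dir, Query query) throws Exception {
        IndexReader reader = IndexReader.open(dir, true); // read-only, opened per request
        IndexSearcher searcher = new IndexSearcher(reader);
        try {
            return searcher.search(query, 50);
        } finally {
            searcher.close();
            reader.close();
        }
    }
}
```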