I've got an index that currently occupies about 1 GB of space and holds about 2.5 million documents. The index is stored on a solid-state drive for speed. I'm adding 2,500 documents at a time and committing after each batch has been added. The index is a "live" index and needs to be kept up to date throughout the day and night, so minimising write times is very important. I'm using a merge factor of 10 and am never calling Optimize(), instead allowing the index to optimize itself as needed based on the merge factor.
I need to commit the documents after each batch has been added because I record this fact so that if the app crashes or restarts, it can pick up where it left off. If I didn't commit, the stored state would be inconsistent with what's in the index. I'm assuming my additions, deletions and updates are lost if the writer is destroyed without committing.
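To make the commit-then-record ordering concrete, here is a minimal sketch of that pattern. It is written in Python against a hypothetical stand-in writer (`FakeWriter`, `index_in_batches` and the `checkpoint` dictionary are all invented for illustration and are not part of the Lucene API), so it shows only the ordering of the steps, not the real indexing code.

```python
class FakeWriter:
    """Hypothetical stand-in for an index writer; NOT the Lucene API."""

    def __init__(self):
        self.committed = []  # documents that would survive a crash
        self.pending = []    # documents added since the last commit

    def add_document(self, doc):
        self.pending.append(doc)

    def commit(self):
        # Make everything added so far durable.
        self.committed.extend(self.pending)
        self.pending = []


def index_in_batches(writer, docs, checkpoint, batch_size=2500):
    """Add docs in batches, recording progress only after each commit."""
    start = checkpoint.get("next", 0)  # resume point from a previous run
    for i in range(start, len(docs), batch_size):
        batch = docs[i:i + batch_size]
        for doc in batch:
            writer.add_document(doc)
        writer.commit()                     # 1. make the batch durable
        checkpoint["next"] = i + len(batch)  # 2. only then record progress
    return checkpoint
```

With this ordering, a crash between the two numbered steps leaves the checkpoint behind the index rather than ahead of it, so the worst case on restart is that one batch is indexed again.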
Anyway, I've noticed that after an arbitrary period of time, which could be anywhere from two minutes to two hours and some variable number of previous commits, the indexer seems to stall on the IndexWriter.AddDocument(doc) method, and I can't for the life of me figure out why it's stalling or how to fix it. The block can stay in place for upwards of two hours, which seems strange for an index of less than 2 GB, holding documents in the low millions, with an SSD to work with.
What could cause AddDocument to block? Are there any Lucene diagnostic utilities that could help me? What else could I look for to track down the problem?