7
votes

My indexer, using Lucene, seems to crash during indexing operations after writing an index file approximately 16GB in size.

The stack trace written to the console is repeated three times, for reasons I don't know. For brevity I've only supplied the part that's repeated. Here's the stack trace as written to the console by Lucene:

Lucene.Net.Index.MergePolicy+MergeException: Exception of type 'Lucene.Net.Index.MergePolicy+MergeException' was thrown. ---> System.IO.FileNotFoundException: Could not find file 'PATH_TO_MY_INDEX_DIRECTORY\_xx.cfs'.
File name: 'PATH_TO_MY_INDEX_DIRECTORY\_xx.cfs'
   at Lucene.Net.Index.IndexWriter.HandleMergeException(Exception t, OneMerge merge)
   at Lucene.Net.Index.IndexWriter.Merge(OneMerge merge)
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run()
   --- End of inner exception stack trace ---
   at Lucene.Net.Index.ConcurrentMergeScheduler.HandleMergeException(Exception exc)
   at Lucene.Net.Index.ConcurrentMergeScheduler.MergeThread.Run()
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

When I open the generated index with the Java edition of Luke, the index is deleted (presumably because it's corrupted; the "write.lock" file remains, for example), though this could be a bug in Luke or a misconfiguration on my part.

Creating this index takes approximately 36 hours and I'm not keen on doing it a third time (this isn't the first time it's happened).

I have no idea what's causing this. What can I do?

I'm using Lucene.net 2.9.2 because it's the last version that was built for .NET 3.5.

3
Are you indexing to a local drive? - Jf Beaulac
Yes, it's a local drive. There are no other processes using the index files either, and my indexing program has a single IndexWriter instance. - Dai
A possible reason for this in Lucene Java is running out of file handles; I am not sure it applies to Lucene.net, though. - Jf Beaulac

3 Answers

4
votes

I realised that this was caused by writing too much to the index without calling Commit. I modified my code to call Commit after writing about 10MB of data. I haven't had the exception since, and if it does crash I only need to rebuild the last 10MB rather than the entire 36GB index.
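
For anyone wanting to try the same thing, here is a minimal sketch of the "commit every N bytes" pattern. It is written in plain Java with no Lucene dependency so that it runs on its own; Lucene.Net is a port of Java Lucene, so the shape should carry over, but this is not the code from my indexer. IndexSink, CommittingWriter and CountingSink are hypothetical names invented for the example; in a real indexer the two IndexSink methods would forward to IndexWriter.AddDocument and IndexWriter.Commit, and the byte count is only a rough estimate based on the text length.

```java
import java.nio.charset.StandardCharsets;

final class BatchedCommitSketch {

    /** Hypothetical stand-in for the two IndexWriter operations used here. */
    interface IndexSink {
        void addDocument(String content);
        void commit();
    }

    /** Commits the sink once roughly thresholdBytes have been added since the last commit. */
    static final class CommittingWriter {
        private final IndexSink sink;
        private final long thresholdBytes;
        private long bytesSinceCommit = 0;

        CommittingWriter(IndexSink sink, long thresholdBytes) {
            this.sink = sink;
            this.thresholdBytes = thresholdBytes;
        }

        void add(String content) {
            sink.addDocument(content);
            // Rough size estimate: UTF-8 length of the raw text, not the on-disk index size.
            bytesSinceCommit += content.getBytes(StandardCharsets.UTF_8).length;
            if (bytesSinceCommit >= thresholdBytes) {
                sink.commit();
                bytesSinceCommit = 0;
            }
        }

        /** Commits whatever is still pending; call once at the end of the run. */
        void close() {
            if (bytesSinceCommit > 0) {
                sink.commit();
                bytesSinceCommit = 0;
            }
        }
    }

    /** In-memory sink that only counts calls, used to exercise the sketch. */
    static final class CountingSink implements IndexSink {
        int documents;
        int commits;

        public void addDocument(String content) { documents++; }
        public void commit() { commits++; }
    }

    public static void main(String[] args) {
        CountingSink sink = new CountingSink();
        CommittingWriter writer = new CommittingWriter(sink, 10); // tiny threshold for the demo
        writer.add("12345");
        writer.add("12345"); // reaches 10 bytes, triggers a commit
        writer.add("1");
        writer.close();      // commits the remaining byte
        System.out.println("documents=" + sink.documents + " commits=" + sink.commits);
    }
}
```

With a real writer the threshold would be something like 10 * 1024 * 1024. Smaller values mean less work lost after a crash, at the cost of more frequent commits.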

1
votes

It took a while to find, but this turned out (in my case) to be caused by the local hard disk being full. A more specific exception message would have been helpful.
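
Since the exception itself doesn't mention disk space, one option is to check free space up front and fail with a clearer message. Below is a rough sketch in plain Java (the same idea applies in .NET); DiskSpaceCheckSketch, hasRoom and requireRoom are hypothetical names made up for this example, and how much headroom to demand is an assumption left to the caller. Merges write new segment files before the old ones are removed, so the index temporarily needs more space than its final size.

```java
import java.io.File;
import java.io.IOException;

final class DiskSpaceCheckSketch {

    /** True when the volume holding dir reports at least requiredBytes of usable space. */
    static boolean hasRoom(File dir, long requiredBytes) {
        return dir.getUsableSpace() >= requiredBytes;
    }

    /** Fails early with a descriptive message instead of a confusing merge error later. */
    static void requireRoom(File dir, long requiredBytes) throws IOException {
        long usable = dir.getUsableSpace();
        if (usable < requiredBytes) {
            throw new IOException("Only " + usable + " bytes usable on the volume holding "
                    + dir.getAbsolutePath() + "; need at least " + requiredBytes);
        }
    }

    public static void main(String[] args) throws IOException {
        File indexDir = new File(".");            // placeholder for the index directory
        long headroom = 1024L * 1024L * 1024L;    // assumed 1 GB of headroom, adjust to taste
        System.out.println("enough room: " + hasRoom(indexDir, headroom));
    }
}
```

Calling requireRoom before each batch of writes would turn a full disk into an immediate, readable error.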

0
votes

Right, VERY late to the party but I had the same issue as part of this exception:

Exception Info: Lucene.Net.Index.CorruptIndexException
   at Lucene.Net.Index.IndexWriter.HandleMergeException(System.Exception, OneMerge)
   at Lucene.Net.Index.IndexWriter.Merge(OneMerge)
   at Lucene.Net.Index.ConcurrentMergeScheduler+MergeThread.Run()

Exception Info: Lucene.Net.Index.MergePolicy+MergeException
   at Lucene.Net.Index.ConcurrentMergeScheduler.HandleMergeException(System.Exception)
   at Lucene.Net.Index.ConcurrentMergeScheduler+MergeThread.Run()
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.ThreadHelper.ThreadStart()

I resolved this by deleting all the indexes and rebuilding them.