10
votes

Is it possible to extract the list of all the terms in a Lucene index as a list of strings? I couldn't find that functionality in the doc. Thanks!

2 Answers

17
votes

In Lucene 4 (and 5):

 Terms terms = SlowCompositeReaderWrapper.wrap(directoryReader).terms("field"); 
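That call only gives you a Terms object, not the strings the question asks for. A minimal sketch of walking it with a TermsEnum follows, assuming Lucene 4.x on the classpath. The class name TermLister and the method listTerms are made up for illustration and are not part of Lucene. As far as I know, later 5.x releases drop the reuse argument, so the call becomes terms.iterator() there.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.SlowCompositeReaderWrapper;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper, not part of Lucene.
public final class TermLister {

    private TermLister() {
    }

    // Collects every indexed term of one field as a String.
    public static List<String> listTerms(IndexReader reader, String field) throws IOException {
        List<String> result = new ArrayList<String>();

        // Same call as in the answer: a merged view over all segments.
        Terms terms = SlowCompositeReaderWrapper.wrap(reader).terms(field);
        if (terms == null) {
            // The field does not exist or has no indexed terms.
            return result;
        }

        // Assumption: Lucene 4.x signature iterator(TermsEnum reuse);
        // newer releases use terms.iterator() with no argument.
        TermsEnum termsEnum = terms.iterator(null);
        BytesRef bytes;
        while ((bytes = termsEnum.next()) != null) {
            // utf8ToString() copies the bytes, so the String stays valid
            // even though the enum may reuse the BytesRef instance.
            result.add(bytes.utf8ToString());
        }
        return result;
    }
}
```

Usage would be something like List<String> all = TermLister.listTerms(directoryReader, "field");. For a big index, keep in mind that this loads every term into memory at once.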

Edit:

This seems to be the 'correct' way now (Lucene 6 and up):

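// LuceneDictionary lives in the lucene-suggest module (org.apache.lucene.search.spell).
LuceneDictionary ld = new LuceneDictionary( indexReader, "field" );
// Recent releases expose getEntryIterator(); early 4.x releases called it getWordsIterator().
BytesRefIterator iterator = ld.getEntryIterator();
List<String> terms = new ArrayList<>();
BytesRef byteRef = null;
while ( ( byteRef = iterator.next() ) != null )
{
    // Each BytesRef holds the UTF-8 bytes of one term.
    terms.add( byteRef.utf8ToString() );
}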
12
votes

Lucene 3: