Indexing Lucene documents with different analysers

Question

Is it okay to index Lucene documents with two different analysers? I need to support both case-sensitive and case-insensitive search, so I am wondering whether I can use two analysers for the same document.

writer.addDocument(doc, new StandardAnalyzer(Version.LUCENE_30));
writer.addDocument(doc, new CustomAnalyser()); // CustomAnalyser: the custom analyser described below

I am planning to write a custom analyser that applies all the filters the standard analyser does, except the lowercase filter. When I search the index, I think we might end up getting duplicates.

Any comments or ideas?

EDIT: @Simon

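// Default analyser (lowercases terms) for every field without an explicit mapping
Analyzer defaultAnalyzer = new StandardAnalyzer(Version.LUCENE_30);
PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(defaultAnalyzer);
// WhitespaceAnalyzer keeps the original casing for this one field
wrapper.addAnalyzer("CaseSensitiveContents", new WhitespaceAnalyzer());

writer = new IndexWriter(FSDirectory.open(index), wrapper, true,
                         new IndexWriter.MaxFieldLength(100));

// One document, two fields; assumes parser.getReader() returns a fresh Reader
// on each call, since a Reader can only be consumed once
doc.add(new Field("contents", parser.getReader(), TermVector.YES));
doc.add(new Field("CaseSensitiveContents", parser.getReader(), TermVector.YES));
writer.addDocument(doc);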

sisve · Accepted Answer · 2011-02-11T17:43:31

Your example code would add two almost identical documents (except their casing) to your index.

How about adding two fields to one document, one case-sensitive and one not? You can use PerFieldAnalyzerWrapper for this.
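To make the difference from the two-document approach concrete, here is a small toy model in plain Java. It is not Lucene's API: the class `TwoFieldIndexSketch` and its methods are hypothetical names made up for illustration, and tokenizing is reduced to splitting on whitespace. It only sketches the idea that a single document id ends up in two fields, one holding lowercased terms and one holding the original casing, so a search on either field can return that document at most once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: this is NOT Lucene's API. It mimics one document being
// indexed into two fields, each field tokenized by a different "analyser".
class TwoFieldIndexSketch {
    // field name -> term -> ids of the documents containing that term
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Index the text once, as a single document with two fields.
    int addDocument(String text) {
        int docId = nextDocId++;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input produces one empty token
            }
            addTerm("contents", token.toLowerCase(Locale.ROOT), docId); // lowercased
            addTerm("CaseSensitiveContents", token, docId);             // casing kept
        }
        return docId;
    }

    private void addTerm(String field, String term, int docId) {
        postings.computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(term, t -> new TreeSet<>())
                .add(docId);
    }

    // Case-insensitive lookup: lowercase the query term, read "contents".
    List<Integer> searchIgnoreCase(String term) {
        return lookup("contents", term.toLowerCase(Locale.ROOT));
    }

    // Case-sensitive lookup: use the term as typed, read "CaseSensitiveContents".
    List<Integer> searchExactCase(String term) {
        return lookup("CaseSensitiveContents", term);
    }

    private List<Integer> lookup(String field, String term) {
        Map<String, Set<Integer>> terms =
                postings.getOrDefault(field, Collections.emptyMap());
        Set<Integer> ids = terms.getOrDefault(term, Collections.emptySet());
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        TwoFieldIndexSketch index = new TwoFieldIndexSketch();
        index.addDocument("Lucene in Action");
        System.out.println(index.searchIgnoreCase("LUCENE")); // prints [0]
        System.out.println(index.searchExactCase("Lucene"));  // prints [0]
        System.out.println(index.searchExactCase("lucene"));  // prints []
    }
}
```

In the real index the same split applies at query time: search the "contents" field for case-insensitive matches and the "CaseSensitiveContents" field for exact-case matches.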
