2
votes

How to make a field name case-insensitive in Lucene?

Suppose I have the following document:

User:xyz

Now, the document should be returned as a result for queries "user:xyz", "uSer:xyz", or "usEr:xyz".

One possible solution is to lowercase the field while indexing and searching, but I need the exact value of the field when retrieving the document. Another option is to index the field twice, but that is not a proper solution either.

Here is a Lucene example. When the query is "user:xyz", the document doesn't match. But with the query "User:xyz" the document matches, because the field was indexed as "User".

public void testFieldCaseSensitive() throws ParseException,
        QueryNodeException {
    StandardQueryParser parser = new StandardQueryParser();
    Query luceneQuery = parser.parse("user:xyz","");
    MemoryIndex memoryIndex = new MemoryIndex();
    memoryIndex.addField("User", "xyz", new StandardAnalyzer(
            Version.LUCENE_43));
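    // search() returns the match score; it is 0 here because the indexed
    // field name "User" differs in case from the queried field "user".
    Assert.assertTrue(memoryIndex.search(luceneQuery) > 0);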
}
If I'm not wrong it's already case insensitive I guess? - Umar Iqbal
Can you provide some example? I have done with the values, but not the field itself. - puneet
Yeah sure, see my answer in a moment. - Umar Iqbal
Sorry, didn't understand your question correctly when I first read it. Should have read more carefully. - femtoRgon

2 Answers

6
votes

Field names are case sensitive. As far as I know, there is no switch to flip in order to make them otherwise.

Probably the most reasonable approach is to ensure that all field names are lowercased when indexing documents. Then, when querying, if you aren't querying against any case-sensitive field, you can just use String.toLowerCase() to lowercase the entire query string as well, effectively making it case-insensitive.
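
A narrower variant of the lowercasing idea is to lowercase only the field-name prefixes of the query, leaving terms and operators as typed. Lowercasing the whole string would also turn operators such as AND into lowercase, which, as far as I know, the Lucene query parsers then treat as ordinary terms. The sketch below uses only the JDK; the class name FieldNameNormalizer, the method lowercaseFieldNames, and the regex are hypothetical and not part of Lucene. It assumes a field name is a run of word characters directly followed by a colon, so it does not handle colons inside quoted phrases or escaped colons. Field names would need the same toLowerCase(Locale.ROOT) treatment at index time; the field value is not touched.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldNameNormalizer {
    // Assumption: a run of word characters immediately followed by ':' is a field name.
    private static final Pattern FIELD_PREFIX = Pattern.compile("\\b(\\w+):");

    /** Lowercases field names only; terms and operators are left as written. */
    public static String lowercaseFieldNames(String query) {
        Matcher m = FIELD_PREFIX.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1).toLowerCase(Locale.ROOT) + ":";
            // quoteReplacement guards against '$' or '\' being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "user:xyz AND title:Lucene"
        System.out.println(lowercaseFieldNames("uSer:xyz AND Title:Lucene"));
    }
}
```

The normalized string can then be handed to the query parser as usual, for example parser.parse(FieldNameNormalizer.lowercaseFieldNames(input), "").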

-1
votes

Apache Lucene is already case-insensitive: whatever you search for (in upper, lower, or mixed case), it will bring you results.

Basically, the analyzer you use for indexing already covers it, and in most cases that's StandardAnalyzer. I've just tested it.

For searching:

DocSearchEngine searcher = new DocSearchEngine();
ScoreDoc[] hits = searcher.searchIndexWithQueryParser("SeArch TeXT");
List<ResStructure> resultSet = searcher.printResultList(hits);

For indexing:

writer = new IndexWriter(FSDirectory.open(new File(indexDir)),
    new IndexWriterConfig(Version.LUCENE_45, new StandardAnalyzer(Version.LUCENE_45)));