I indexed my database into Lucene for full-text search. Everything works fine when I search for keywords that contain no symbols, but whenever I search for keywords containing slashes, decimals, etc. (e.g. 1/4, 1.234, 1-1/4"), Lucene returns no results. What is the best way to index terms that contain symbols?
0 votes
What analyzer are you using? Are those symbols in text fields or separate fields?
– Thomas
@Thomas is correct, you are likely using StandardAnalyzer which strips out most punctuation and symbols. You could pass a custom stopwords list or write a custom analyzer to suit your needs.
– Mikos
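To see why StandardAnalyzer loses these terms, here is a rough plain-Java illustration (not Lucene itself — the class and method names below are made up for the demo): StandardAnalyzer's tokenizer breaks on most punctuation, so a term like 1-1/4" is split into bare digits, while a whitespace-style tokenizer keeps the symbols intact.

```java
import java.util.Arrays;
import java.util.List;

public class TokenizeDemo {
    // Rough approximation of StandardAnalyzer's behavior:
    // lowercase, then split on any non-alphanumeric character.
    static List<String> standardLike(String text) {
        return Arrays.asList(text.toLowerCase().split("[^\\p{Alnum}]+"));
    }

    // Rough approximation of WhitespaceAnalyzer's behavior:
    // split on whitespace only, keeping punctuation inside tokens.
    static List<String> whitespaceLike(String text) {
        return Arrays.asList(text.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(standardLike("1-1/4\" bolt"));   // [1, 1, 4, bolt]
        System.out.println(whitespaceLike("1-1/4\" bolt")); // [1-1/4", bolt]
    }
}
```

This is why a query for 1-1/4" finds nothing against a StandardAnalyzer index: that exact token was never stored. (The real StandardAnalyzer is smarter than this sketch — for example it keeps some numeric forms like 1.234 together — but the punctuation-stripping behavior is the same idea.)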
I use StandardAnalyzer, and the symbols are in the same field. If StandardAnalyzer strips out symbols, what would be the best analyzer to use instead?
– maccramers
I have an idea, but I'm not sure it will work: modify StandardAnalyzer's stop words by disabling everything except splitting on spaces. I tried WhitespaceAnalyzer in my code, but it didn't work. How would I implement this?
– maccramers
3 Answers
Fortunately, newer versions of Lucene already provide a convenience method for escaping these characters: the static method escape(String s) on QueryParser.
From the docs:
public static String escape(String s)
Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding \.
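In practice you would just call QueryParser.escape(userInput) before parsing the query. As a standalone illustration of what that escaping does (a sketch only — use the real QueryParser.escape in production, since its exact special-character list varies by Lucene version and includes two-character operators like && and ||), the logic amounts to prefixing each special character with a backslash:

```java
public class QueryEscapeDemo {
    // Sketch of QueryParser.escape's documented behavior: prefix each
    // Lucene query-syntax character with a backslash. The character
    // list here is illustrative, not authoritative.
    static String escape(String s) {
        String special = "\\+-!():^[]\"{}~*?|&/";
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (special.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("1-1/4\"")); // prints 1\-1\/4\"
    }
}
```

Note that escaping only helps if the analyzer kept those characters in the index in the first place; if StandardAnalyzer stripped them at index time, you still need an analyzer (such as WhitespaceAnalyzer) that preserves them, applied consistently at both index and query time.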