3
votes

Whilst building some unit tests for my Lucene queries I noticed some strange behavior related to punctuation, in particular around parentheses.

What are some of the best ways to deal with search fields that contain significant amounts of punctuation?

2 Answers

3
votes

If you haven't customized the query parser, Lucene should behave according to the default query parser syntax. Are you seeing something different from that? Do you want punctuation to have a special meaning, or do you just want to strip it from searches? The other usual suspect here is the Analyzer, which determines how your field is indexed and how the query is broken into terms for searching. Can you post specific examples of the bad behavior?

1
votes

It is not just parentheses; other punctuation such as the colon and hyphen will cause issues too. Here is a way to deal with them.
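As an illustration only (not necessarily the approach this answer had in mind), one common way to keep punctuation from being read as query operators is to backslash-escape it before handing the string to the parser. The sketch below is hand-rolled and has no Lucene dependency; the class name `QueryEscaper` and the exact character list are assumptions made for this example. The classic query parser also ships a static `QueryParser.escape(String)` helper that serves the same purpose.

```java
// Minimal sketch: backslash-escape characters that the classic Lucene
// query syntax treats as operators. QueryEscaper is a hypothetical name
// for this example, not a Lucene class.
final class QueryEscaper {

    // Assumed set of special characters: \ + - ! ( ) : ^ [ ] " { } ~ * ? | & /
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Returns a copy of the input with each special character
    // preceded by a backslash; all other characters pass through.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Parentheses, the colon, and the hyphen are escaped.
        System.out.println(escape("foo (bar): baz-qux"));
    }
}
```

Escaping only stops the parser from treating the characters as syntax. Whether the punctuation can actually be matched still depends on the Analyzer used at index and query time, since many analyzers drop punctuation during tokenization.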