6
votes

I'm trying to do a fuzzy match on the phrase "Grand Prarie" (deliberately misspelled) using Apache Lucene. Part of my issue is that the ~ operator only does fuzzy matching on single-word terms; on a phrase it acts as a proximity operator instead.

Is there a way to do a fuzzy match on a phrase with Lucene?


3 Answers

7
votes

Lucene 3.0 has a ComplexPhraseQueryParser that supports fuzzy phrase queries. It is in the contrib package.

4
votes

I came across this through Google and felt the solutions were not what I was after. In my case, the solution was simply to repeat the fuzzy search for each word against the Solr API. For example, if I wanted title_t to match both "dog~" and "cat~", I added some code to generate the query as:

((title_t:dog~) AND (title_t:cat~))

This might be just what the answers above are describing, but the links seem dead.
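The "manual code" could look something like the following minimal sketch. The class and method names (FuzzyAndQuery, build) are made up for illustration and are not part of Lucene or Solr; it only assembles the query string. It joins clauses with uppercase AND, since the classic Lucene query parser expects boolean operators in uppercase, and it assumes the words contain no query-syntax characters that would need escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not a Lucene/Solr class): builds a query string that
// requires a fuzzy match on every word in the same field.
final class FuzzyAndQuery {

    // Wraps each word as (field:word~) and joins the clauses with AND,
    // enclosing the whole expression in one outer pair of parentheses.
    // Assumption: words are plain tokens with no special query characters.
    static String build(String field, List<String> words) {
        return words.stream()
                .map(w -> "(" + field + ":" + w + "~)")
                .collect(Collectors.joining(" AND ", "(", ")"));
    }

    public static void main(String[] args) {
        // Prints ((title_t:dog~) AND (title_t:cat~))
        System.out.println(build("title_t", Arrays.asList("dog", "cat")));
    }
}
```

Note that this matches documents containing both fuzzy terms anywhere in the field; unlike a phrase query, it does not require them to be adjacent or in order.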

3
votes

There's no direct support for a fuzzy phrase, but you can simulate it by explicitly enumerating the fuzzy terms and then adding them to a MultiPhraseQuery. The resulting query would look like:

<MultiPhraseQuery: "grand (prarie prairie)">
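To make the "enumerate the fuzzy terms" step concrete, here is a rough sketch in plain Java with no Lucene dependency. Everything in it is an assumption for illustration: the class FuzzyPhraseExpansion, its methods, and the in-memory vocabulary list stand in for walking the index's term dictionary, and a plain Levenshtein distance stands in for Lucene's fuzzy similarity. In real code, each list of alternatives would be turned into a Term[] and passed to MultiPhraseQuery.add (or MultiPhraseQuery.Builder.add in newer Lucene versions).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: expands each word of a phrase into the vocabulary
// terms within a small edit distance, one list of alternatives per position.
final class FuzzyPhraseExpansion {

    // Classic dynamic-programming Levenshtein distance using two rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // newest row is now prev
        }
        return prev[b.length()];
    }

    // Returns the vocabulary terms within maxEdits of word, in vocabulary order.
    static List<String> expand(String word, List<String> vocabulary, int maxEdits) {
        List<String> matches = new ArrayList<>();
        for (String term : vocabulary) {
            if (editDistance(word, term) <= maxEdits) matches.add(term);
        }
        return matches;
    }

    // Expands every position of the phrase independently.
    static List<List<String>> expandPhrase(List<String> words, List<String> vocabulary, int maxEdits) {
        List<List<String>> positions = new ArrayList<>();
        for (String word : words) positions.add(expand(word, vocabulary, maxEdits));
        return positions;
    }

    // Renders positions in the style shown above: alternatives in parentheses.
    static String render(List<List<String>> positions) {
        List<String> parts = new ArrayList<>();
        for (List<String> alts : positions) {
            parts.add(alts.size() == 1 ? alts.get(0) : "(" + String.join(" ", alts) + ")");
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        // Assumed vocabulary; in practice these terms would come from the index.
        List<String> vocabulary = Arrays.asList("grand", "prarie", "prairie", "prague");
        List<List<String>> positions = expandPhrase(Arrays.asList("grand", "prarie"), vocabulary, 1);
        System.out.println(render(positions)); // grand (prarie prairie)
    }
}
```

Scanning the whole vocabulary like this is linear in the number of terms, so it is only meant to show the idea, not to be efficient on a large index.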