In Lucene, I can use fuzzy search to get 'similar' results.
For example, the following query:
text:awesome~0.8
will find documents containing terms that are about 80% similar, such as 'awesom'.
My question is: can I use fuzzy search on an entire text (multiple words)?
For example, I want to find texts that are 80% similar to the following text:
this is my text with multiple words
Putting a fuzzy clause on each word does not give me the desired results:
text:(+this~0.8 +is~0.8 +my~0.8 +text~0.8 +with~0.8 +multiple~0.8 +words~0.8)
This returns only those documents that contain all of the words specified in the query (or a word 80% similar to each of them).
I expect the query to return results where the entire string is 80% similar, even if a whole word is missing, for example:
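For reference, this is roughly how the per-word query string above can be built in Java. It is only a sketch of the approach I tried: the class and method names (PerWordFuzzyQuery, build) are made up for illustration, and it only produces the query text, it does not call Lucene.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper: builds the per-word fuzzy query string shown above.
// It only assembles text; parsing and searching are left to Lucene.
class PerWordFuzzyQuery {

    static String build(String field, String text, double minSimilarity) {
        // Split on whitespace and mark every word as required (+) and fuzzy (~).
        return Arrays.stream(text.trim().split("\\s+"))
                .map(word -> "+" + word + "~" + minSimilarity)
                .collect(Collectors.joining(" ", field + ":(", ")"));
    }

    public static void main(String[] args) {
        System.out.println(build("text", "this is my text with multiple words", 0.8));
    }
}
```

Because every clause is required, a document that is missing a single word (such as 'my') is rejected no matter how close the rest of the text is.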
this is text with multiple words
Something like this:
text:(+this +is +my +text +with +multiple +words)~0.8
Obviously the above query gives a syntax error, but I need results based on the similarity of the entire text/phrase.
I am happy to use the Java API classes for this, since I need to do it in a Java program.
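To make clear what I mean by "the entire string is 80% similar", here is a minimal plain-Java sketch. It is not Lucene code and not necessarily how Lucene computes fuzzy similarity: the normalization (1 minus the Levenshtein distance divided by the longer string's length) and the names WholeTextSimilarity, levenshtein, and similarity are my own assumptions for illustration.

```java
// Illustrative only: whole-string similarity based on Levenshtein edit distance.
class WholeTextSimilarity {

    // Classic dynamic-programming edit distance, keeping two rows of the table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0; // two empty strings are identical
        }
        return 1.0 - (double) levenshtein(a, b) / longest;
    }

    public static void main(String[] args) {
        String query = "this is my text with multiple words";
        String candidate = "this is text with multiple words";
        System.out.println(levenshtein(query, candidate));
        System.out.println(similarity(query, candidate) >= 0.8);
    }
}
```

Under this definition the two example texts differ by 3 edits (dropping "my "), giving a similarity of roughly 0.91, so the second text should count as a match at the 0.8 threshold. What I am looking for is a way to get this kind of whole-text matching from a Lucene query rather than comparing every document by hand.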