I have an application that accepts free-text searches from users. Suppose a user types "one two three" into an HTML text input; my search URI is then ".../solr/my_index/select?q=expressions:(one two three)...".
Documents are described in the schema as follows:
<field name="id" type="int" indexed="true" stored="true" required="true" />
<field name="expressions" type="text_general" indexed="true" stored="true" multiValued="true" />
In "my_index" I have two documents indexed:
id:"1", expressions: ["seven one two three four five", "seven eight seven", "two six nine six"]
id:"2", expressions: ["one", "one two", "one two four", "four one two one"]
The result of the query is that document id=2 gets the higher score, because it contains more matches of the words "one" and "two". But I have a more specific requirement: the score must reflect not the match count but the similarity to the search phrase. Document id=1 has the value "seven one two three four five", which contains the substring "one two three". That is very close to the phrase the user typed, so document id=1 must get the higher score.
Can this be done? I am very new to Solr/Lucene, so I don't know whether I need to use a specific query parser, build a custom one, or do something else.
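To make the requirement concrete, here is a toy sketch in plain Python. It is not Solr or Lucene code and does not reproduce Solr's real scoring formula; the function names `term_match_count` and `longest_phrase_run` are made up for illustration. The first function is only a crude stand-in for what the current score seems to reward, and the second is the kind of measure I would like the score to follow.

```python
# Toy illustration only: plain Python, not Solr/Lucene scoring.
docs = {
    1: ["seven one two three four five", "seven eight seven", "two six nine six"],
    2: ["one", "one two", "one two four", "four one two one"],
}
query = "one two three"


def term_match_count(values, query):
    """Count every occurrence of any query term across all values."""
    terms = set(query.split())
    return sum(1 for value in values for word in value.split() if word in terms)


def longest_phrase_run(values, query):
    """Longest run of consecutive query terms, in query order, inside one value."""
    q = query.split()
    best = 0
    for value in values:
        words = value.split()
        for i in range(len(words)):
            for j in range(len(q)):
                k = 0
                # Extend the run while the value and the query keep agreeing.
                while (i + k < len(words) and j + k < len(q)
                       and words[i + k] == q[j + k]):
                    k += 1
                best = max(best, k)
    return best


for doc_id, values in docs.items():
    print(doc_id, term_match_count(values, query), longest_phrase_run(values, query))
```

By raw term matches document 2 wins (8 against 4), but by the longest in-order run of query words document 1 wins (3 against 2), and that second ordering is the one I need.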
Thanks.