We are using Solr 8.4.1 to search products. I want documents containing the exact phrase to rank at the top, but if the same phrase is repeated many times in a document, it should be counted only once. Right now, documents in which the phrase appears multiple times rank highest because they get a higher score.
Please see the result below for a search for "pipes". Seven results are found, but prod_id:297720 (named "my pipes pipes") has a different score from prod_id:3064.
For our needs, both should have the same score: repeated occurrences of the phrase within a document should be ignored.
We are using the similarity class <similarity class="solr.BM25SimilarityFactory"/>.
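To make the effect concrete, here is a minimal sketch of the per-term BM25 score as I understand Lucene 8 to compute it: idf * tf / (tf + k1 * (1 - b + b * dl / avgdl)). The idf and average field length below are made-up illustration values, not numbers from our index, and the function name is hypothetical.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    """Per-term BM25 score (assumed Lucene 8 form, without the (k1 + 1) factor)."""
    # Length normalization: longer-than-average fields are penalized.
    norm = 1.0 - b + b * doc_len / avg_doc_len
    return idf * tf / (tf + k1 * norm)

# Hypothetical statistics, chosen only for illustration.
IDF = 1.0
AVG_LEN = 3.0

# "my pipes pipes 00": the term occurs twice in a 4-token field.
repeated = bm25_term_score(tf=2, doc_len=4, avg_doc_len=AVG_LEN, idf=IDF)
# "pipes 00": the term occurs once in a 2-token field.
single = bm25_term_score(tf=1, doc_len=2, avg_doc_len=AVG_LEN, idf=IDF)

print(repeated, single)
print(repeated > single)  # the repeated term wins despite the longer field
```

With these assumed values the document that repeats the term scores higher, which matches the ordering we see in the real results.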
The schema of the searched field is:
<field name="product_related_kword" type="text_general" indexed="true" stored="true" />
And the result is:
{
  "responseHeader":{
    "status":0,
    "QTime":222,
    "params":{
      "q":"product_related_kword:pipes",
      "fl":"product_name,score,product_related_kword,prod_id,member_classified_slno,member_id",
      "start":"0",
      "rows":"100",
      "debugQuery":"on"}},
  "response":{"numFound":7,"start":0,"maxScore":2.7593598,"docs":[
      {
        "prod_id":297720,
        "product_name":"my pipes pipes",
        "member_classified_slno":123457327,
        "member_id":"11111327",
        "product_related_kword":"my pipes pipes 00",
        "score":2.7593598},
      {
        "prod_id":3064,
        "product_name":"pipes",
        "member_classified_slno":123457560,
        "member_id":"11119579",
        "product_related_kword":"pipes 00",
        "score":2.5436506},
      {
        "prod_id":3064,
        "product_name":"pipes",
        "member_classified_slno":123457544,
        "member_id":"11113186",
        "product_related_kword":"pipes 00",
        "score":2.5436506},
      {
        "prod_id":3064,
        "product_name":"pipes",
        "member_classified_slno":123457546,
        "member_id":"11113636",
        "product_related_kword":"pipes 00",
        "score":2.5436506},
      {
        "prod_id":3064,
        "product_name":"pipes",
        "member_classified_slno":123457551,
        "member_id":"11119238",
        "product_related_kword":"pipes 00",
        "score":2.5436506},
      {
        "prod_id":3064,
        "product_name":"pipes",
        "member_classified_slno":123457553,
        "member_id":"785565531",
        "product_related_kword":"pipes 00",
        "score":2.5436506
      }
    ]
  },
tf always returning 1. – MatsLindh
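As a rough illustration of what a tf that always returns 1 would do, here is a small sketch assuming the Lucene 8 style BM25 term score idf * tf / (tf + k1 * (1 - b + b * dl / avgdl)). The helper names and the idf and average-length values are hypothetical, not taken from the index above. It suggests that a constant tf alone may not be enough, because length normalization still separates a 4-token field from a 2-token one.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    """Per-term BM25 score (assumed Lucene 8 form)."""
    norm = 1.0 - b + b * doc_len / avg_doc_len
    return idf * tf / (tf + k1 * norm)

def constant_tf(tf):
    """Treat any number of occurrences as a single occurrence."""
    return 1 if tf > 0 else 0

# Hypothetical statistics, chosen only for illustration.
IDF, AVG_LEN = 1.0, 3.0

# Constant tf, default length normalization (b = 0.75).
clamped_repeated = bm25_term_score(constant_tf(2), 4, AVG_LEN, IDF)
clamped_single = bm25_term_score(constant_tf(1), 2, AVG_LEN, IDF)
print(clamped_repeated < clamped_single)  # the longer field now scores lower

# Constant tf with length normalization switched off (b = 0).
flat_repeated = bm25_term_score(constant_tf(2), 4, AVG_LEN, IDF, b=0.0)
flat_single = bm25_term_score(constant_tf(1), 2, AVG_LEN, IDF, b=0.0)
print(flat_repeated == flat_single)  # both documents score the same
```

Under these assumptions the two documents only tie when the field length is also taken out of the score, for example with b set to 0 or with norms omitted on the field.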