I am experimenting with boosting in Solr and am confused about how my document scores are being affected.
I have a collection of technical documents that contain fields like Title, Symptoms, Resolution, Classification, Tags, etc. All the fields listed are required except Tags, which is optional. All fields are copied to _text_, and that field is the default search field.
When I run a default query
http://search:8983/solr/articles-experimental/select?defType=edismax&fl=id,%20tags,%20score&q=virtualization&qf=_text_
The top article (Article 42014) comes back with a score of 4.182179. This document has 6 instances of the word virtualization in multiple fields -- Title, Symptoms, Resolution, and Classification. This particular article does not have any Tags value.
I now want to experiment with boosting so that articles that have Tag values matching the search terms appear closer to the top of the results. To do this, I send the following query
http://search:8983/solr/articles-experimental/select?defType=edismax&fl=id,tags,score&q=virtualization&qf=tags^2%20_text_
which keeps the same Article 42014 at the top of the list but now with a score of 4.269944. However, results 2 through 65 now all have the same score of 4.255975. In the non-boosted query the scores range from 4.056591 down to 2.7029662.
In addition, the set of document IDs coming back is not quite the same as before. I certainly expect some differences, but not to the extent that I am seeing, considering that the vast majority of the articles coming back have the search term as a tag.
Ultimately, I am having trouble finding out exactly how boosting changes the score and what is an "appropriate" boost value. Understanding that it is probably subjective, what criteria should I be considering?
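For reference, here is a minimal sketch of how edismax is generally understood to combine per-field scores. Each field in qf produces its own score for the term, the boost multiplies that field's score, and the document gets the highest boosted score plus tie times the sum of the others (tie defaults to 0.0). The function name and the per-field scores are made up for illustration; they are not values from this collection, and the real per-field scores depend on the configured similarity.

```python
def dismax_score(field_scores, boosts, tie=0.0):
    """Combine per-field scores in disjunction-max style.

    field_scores: {field name: raw score of the term in that field}
    boosts:       {field name: qf boost}; fields not listed default to 1.0
    tie:          weight given to the non-winning fields (assumed default 0.0)
    """
    boosted = [score * boosts.get(field, 1.0)
               for field, score in field_scores.items()]
    if not boosted:
        return 0.0
    best = max(boosted)
    return best + tie * (sum(boosted) - best)


if __name__ == "__main__":
    boosts = {"tags": 2.0}  # corresponds to qf=tags^2 _text_

    # Hypothetical documents: the raw scores are invented for illustration.
    no_tag_doc = {"_text_": 4.0}
    tagged_rich = {"tags": 2.5, "_text_": 3.0}
    tagged_poor = {"tags": 2.5, "_text_": 1.0}

    for name, doc in [("no_tag_doc", no_tag_doc),
                      ("tagged_rich", tagged_rich),
                      ("tagged_poor", tagged_poor)]:
        print(name, dismax_score(doc, boosts), dismax_score(doc, boosts, tie=0.1))
```

In this sketch, with tie at 0.0 the two tagged documents get the same score because the boosted tags clause wins for both and their _text_ scores are ignored. A non-zero tie lets the other fields separate them again. Whether that is what produces the identical scores in the results above is an assumption that the explain output would have to confirm.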
Add debugQuery=true to your query, and it'll show you exactly how the score is being calculated, including which values are being multiplied or added together. explain.solr.pl is useful for visualizing these values. – MatsLindh
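As a rough sketch of the suggestion above, the boosted query from the question can be rebuilt with debugQuery=true appended. The host and collection come from the question; the helper names are hypothetical, and wt=json is added here on the assumption that a JSON response is wanted.

```python
from urllib.parse import urlencode

# Endpoint taken from the question.
BASE = "http://search:8983/solr/articles-experimental/select"


def build_debug_url(q, qf, base=BASE):
    """Build an edismax select URL with score debugging switched on."""
    params = {
        "defType": "edismax",
        "fl": "id,tags,score",
        "q": q,
        "qf": qf,
        "debugQuery": "true",  # asks Solr to include the score explanation
        "wt": "json",
    }
    return base + "?" + urlencode(params)


def explanations(response):
    """Return {doc id: explanation} from a parsed JSON response.

    Assumes the explanations sit under debug -> explain; returns an
    empty dict if that section is missing.
    """
    return response.get("debug", {}).get("explain", {})


if __name__ == "__main__":
    print(build_debug_url("virtualization", "tags^2 _text_"))
```

Fetching the URL (for example with urllib.request.urlopen and json.load) and passing the parsed body to explanations should give one explanation per returned document, which can then be compared between the boosted and non-boosted queries.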