I have a problem with the default Solr scoring algorithm that's specific to the domain of my collection. In my domain, documents that contain all or most of the query terms are substantially more relevant than documents that contain only a few. I would like to boost the score of documents so that the more terms match, the higher the score. I'm aware that Solr already boosts such documents by multiplying the score by the coordination factor. However, the coordination factor is not significant enough for me, and I wish to raise it to a certain power. I'm also familiar with the ExtendedDismax parser's Minimum-Should-Match feature, but that feature doesn't solve my problem: I don't want to eliminate the documents that don't match enough terms, I just want to "punish" them.

Is there a way to increase the significance of the coordination factor? I'll also accept other solutions that don't make any use of the coordination factor if they solve the problem.

1 Answer

It might be easiest to just write your own similarity. You can override the coord method with whatever you like, and its implementation is pretty simple. Something like:

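// Lucene 4.x/5.x package; in Lucene 3.x the class is org.apache.lucene.search.DefaultSimilarity
import org.apache.lucene.search.similarities.DefaultSimilarity;

public class MySimilarity extends DefaultSimilarity {
    @Override
    public float coord(int overlap, int maxOverlap) {
        // Square the default coordination factor; use Math.pow for other exponents.
        float c = super.coord(overlap, maxOverlap);
        return c * c;
    }
}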

You can bring in your own similarity implementation in the schema:

<similarity class="this.is.MySimilarity"/>
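As a rough illustration of how much a powered coordination factor "punishes" partial matches, here is a small standalone sketch with no Lucene dependency. It assumes the default coord is overlap / maxOverlap, which is what DefaultSimilarity computes; the class name CoordPowerDemo, the method poweredCoord and the exponent parameter are made up for this example.

```java
// Standalone sketch: compares the default coordination factor with a powered one.
// CoordPowerDemo, poweredCoord and exponent are hypothetical names for illustration.
class CoordPowerDemo {

    // Fraction of query terms matched, as in DefaultSimilarity.coord.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // The same factor raised to an arbitrary power; exponent > 1 widens the gap
    // between full and partial matches, exponent == 1 leaves it unchanged.
    static float poweredCoord(int overlap, int maxOverlap, double exponent) {
        return (float) Math.pow(coord(overlap, maxOverlap), exponent);
    }

    public static void main(String[] args) {
        int maxOverlap = 4; // a four-term query
        for (int overlap = 1; overlap <= maxOverlap; overlap++) {
            System.out.printf("%d/%d terms: coord=%.4f squared=%.4f cubed=%.4f%n",
                    overlap, maxOverlap,
                    coord(overlap, maxOverlap),
                    poweredCoord(overlap, maxOverlap, 2.0),
                    poweredCoord(overlap, maxOverlap, 3.0));
        }
    }
}
```

With a four-term query, a document matching one term keeps a quarter of its score under the default factor but only a sixteenth when the factor is squared, while a document matching every term is unaffected. Trying a few exponents this way can help you pick one before wiring it into the similarity class.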