SOLR index time boost depending on the field value

Question

Is it possible to boost a document at index time depending on a field value?

I'm indexing a text field pulled from the database. I would like shorter results to rank above longer ones, so the boost value should depend on the length of the text field.

I need this to override the standard Solr behavior, which in my case tends to return documents with multiple matches first.

Assuming I have a field that stores the length of the document, the query-time equivalent of what I need at index time would be:

q={!boost b=sqrt(length)}text:abcd

Example: I have two items in the DB:

ABCDEBCE
ABCD

I always want to get ABCD first for the query 'BC', even though the other item contains the search term twice.

Another solution would be the ability to 'switch off', at query time, the feature that scores multiple matches higher. I don't know whether that is possible either.

Doing this at index time is important because the hardware I run Solr on is not very powerful, and boosting at query time fails with an OutOfMemory exception. (Even if I could work around that by giving Java more memory, I would prefer to stay on the safe side and build the index as efficiently as possible.)

MatsLindh · Accepted Answer · 2012-11-26T13:41:26

Yes and no. How you do it depends on how you're indexing your documents.

As far as I know, there is currently no way to do this purely on the Solr server side.

If you're using the regular XML-based interface to submit documents, have the code that generates the XML add a boost=".." attribute to the field or to the document, with a value computed from the length of the text field.
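
As a rough sketch of that idea, the client code could build the update XML like this. The field names (id, text), the helper names, and the 1/sqrt(length) formula are assumptions made up for illustration; pick whatever curve suits your data.

```python
import math
import xml.etree.ElementTree as ET


def length_boost(text):
    # Hypothetical formula: the shorter the text, the larger the boost.
    # max(..., 1) avoids a division by zero for empty strings.
    return 1.0 / math.sqrt(max(len(text), 1))


def build_add_xml(docs, text_field="text"):
    # docs is an iterable of (id, text) pairs; "id" and "text" are
    # assumed field names, adjust them to match your schema.
    add = ET.Element("add")
    for doc_id, text in docs:
        # Document-level boost attribute computed from the text length.
        doc = ET.SubElement(add, "doc", boost=f"{length_boost(text):.4f}")
        ET.SubElement(doc, "field", name="id").text = str(doc_id)
        ET.SubElement(doc, "field", name=text_field).text = text
    return ET.tostring(add, encoding="unicode")


xml_payload = build_add_xml([(1, "ABCDEBCE"), (2, "ABCD")])
print(xml_payload)
```

The resulting string is what you would POST to the update handler. As far as I know, index-time boosts are folded into the field norm, which is stored with low precision, so the field must not have omitNorms enabled and small differences between boost values may be lost.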
