
I am using Solr to search for institutions. My Solr index has around 400k documents, each of which has multiple fields such as "name", "id", "city", and so on.

A document in my DB looks like this:

    "id": "91348",
    "p_code": "71637",
    "name": "University of Toronto - Mississauga",
    "ext_name": "",
    "city": "Mississauga",
    "country": "CA",
    "state": "ON",
    "type": "academic/campus",
    "alt_name": "",
    "ext_city": "",
    "zip": "L5L 1C6",
    "alt_ext_city": "",

I run a query like {name: (university of toronto)}. The top two matches are:

    "id": "91348",
    "p_code": "71637",
    "name": "University of Toronto - Mississauga",
    "ext_name": "",
    "city": "Mississauga",
    "country": "CA",
    "state": "ON",
    "type": "academic/campus",
    "alt_name": "",
    "ext_city": "",
    "zip": "L5L 1C6",
    "alt_ext_city": "",
    "_version_": 1473710223400108000,
    "score": 1.499069

    "id": "10624",
    "p_code": "7938",
    "name": "University of Toronto",
    "ext_name": "",
    "city": "Toronto",
    "country": "CA",
    "state": "ON",
    "type": "academic",
    "alt_name": "Saint George Downtown Campus",
    "ext_city": "",
    "zip": "M5S 1A1",
    "alt_ext_city": "",
    "_version_": 1473710220148473900,
    "score": 1.4967358

I am really surprised to see that "University of Toronto - Mississauga" gets a higher score than "University of Toronto". Intuitively, the document whose name is "University of Toronto - Mississauga" should get a lower score, since its name field is longer than the other one.

I was also very surprised to see that Solr reports different values for queryNorm: (0.03198291 = queryNorm) for the top document and (0.03203078 = queryNorm) for the second-ranked document. I presumed that the query norm should be exactly the same for all documents, as it is only a function of the query.

I am not sure whether I have misunderstood how Solr works or whether there is something wrong with my indexing or configuration. Has anybody faced the same problem?

What does your complete query string look like? And are we talking about a single server, or is sharding or SolrCloud involved? - MatsLindh

As for why the shorter field doesn't get a boost in score, my best guess is that you have omitNorms=true. Favoring shorter fields when scoring, as you've mentioned, relies on having norms stored. - femtoRgon

1 Answer


Make sure that omitNorms is set to false for that field and that your collection is using the latest version of the schema. Then re-index all of your documents for the change to the field to take effect.
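
To illustrate why norms matter here, the following is a minimal sketch of the length normalization used by Lucene's classic TF-IDF similarity, which computes 1/sqrt(number of terms) for a field. It assumes that similarity is in use, and it assumes simple whitespace tokenization with the hyphen dropped, since the real analyzer chain for the name field is not shown in the question. Lucene also stores norms in a lossy single-byte encoding, so actual values may differ. The function name `length_norm` is hypothetical.

```python
import math


def length_norm(num_terms: int, omit_norms: bool = False) -> float:
    """Length normalization factor for a field with num_terms indexed terms.

    Sketch of the classic TF-IDF formula 1 / sqrt(num_terms). With norms
    omitted, every field is treated as if its norm were 1.0, so field
    length no longer influences the score.
    """
    if omit_norms:
        return 1.0
    return 1.0 / math.sqrt(num_terms)


# Assumed tokenization: split on whitespace and drop the bare hyphen.
names = ["University of Toronto", "University of Toronto - Mississauga"]
for name in names:
    terms = [t for t in name.split() if t != "-"]
    print(name, len(terms), length_norm(len(terms)), length_norm(len(terms), omit_norms=True))
```

With norms enabled, the shorter name gets the larger factor. With `omit_norms=True` both names get the same factor, which would be consistent with the two nearly identical scores in the question.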

I've found that some schema modifications are best treated with a complete wipe of the index prior to indexing in new content. I am not sure, but I believe this may be one of them. For most of the changes you can just re-index all of your content and overwrite the old stuff.
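
If you do go for a complete wipe, one way to do it is a delete-by-query for `*:*` sent to the update handler, followed by a full re-index. The sketch below only builds that request; the host, the collection name `institutions`, and the helper `build_delete_all_request` are assumptions for illustration, not taken from the question.

```python
import json
import urllib.request


def build_delete_all_request(base_url: str, collection: str) -> urllib.request.Request:
    """Build a POST request that deletes every document and commits."""
    url = f"{base_url}/{collection}/update?commit=true"
    body = json.dumps({"delete": {"query": "*:*"}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and collection name; adjust to your setup.
req = build_delete_all_request("http://localhost:8983/solr", "institutions")
print(req.get_method(), req.full_url)
# To actually send it (this wipes the index):
# urllib.request.urlopen(req)
```

The request is deliberately not sent in the sketch, because it removes every document in the collection. Run it only when you are ready to re-index everything.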