Solr lower case filter

Question

I'm trying to build a spellchecker in Solr and I'm having an issue with case. Changing the case of the query doesn't affect the number of results returned, but it does change the spellchecker results. For example, if I search for 'leave' I get 7 document results and no spellchecker suggestions. If I search for 'Leave' I still get 7 document results, but now spellcheck returns this:

"spellcheck":{
"suggestions":[
  "Leave",{
    "numFound":3,
    "startOffset":0,
    "endOffset":5,
    "origFreq":0,
    "suggestion":[{
        "word":"leave",
        "freq":7},
      {
        "word":"lease",
        "freq":4},
      {
        "word":"travel",
        "freq":2}]}],
"correctlySpelled":true,
"collations":[
  "collation",{
    "collationQuery":"leave",
    "hits":7,
    "misspellingsAndCorrections":[
      "Leave","leave"]}]}

It suggests the lower case 'leave', even though 'correctlySpelled' is still true. Here are the fields and field types from my schema.xml:

<field name="title"         type="text_en"  indexed="true"  stored="true"   multiValued="false" />
<field name="filename"      type="string"   indexed="true"  stored="true"   multiValued="false" />
<field name="filext"        type="string"   indexed="true"  stored="true"   multiValued="false" />
<field name="version"       type="int"      indexed="false" stored="true"   multiValued="false" />
<field name="docSet"        type="string"  indexed="true"  stored="true"   multiValued="false" />
<field name="businessArea"  type="string"  indexed="true"  stored="true"   multiValued="false" />
<field name="processGroup"  type="string"  indexed="true"  stored="true"   multiValued="false" />
<field name="applicability" type="string"  indexed="true"  stored="true"   multiValued="true"  />
<field name="content"       type="text_en"  indexed="true" stored="true"  multiValued="false" />
<field name="lastIndex"     type="int"      indexed="true" stored="true"   multiValued="false" />
<field name="popularity"    type="int"      indexed="true"  stored="true"   multiValued="false" default="1"/>

<field name="speller"    type="speller_type"  indexed="true"  stored="true"  multiValued="true"  />

<copyField source="*" dest="speller"/>

<fieldType name="speller_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"/>
  </analyzer>
</fieldType>

And these are the spellchecking parts of my solrconfig.xml:

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">

    ...

    <!--****************************************************************
    *   Spellcheck configuration
    *****************************************************************-->
    <str name="spellcheck">on</str>
    <!-- Suggestions -->
    <str name="spellcheck.count">10</str>
    <!-- <str name="spellcheck.maxResultsForSuggest">10</str> -->
    <str name="spellcheck.extendedResults">true</str>
    <!-- Collations -->
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollationTries">5</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.collateMaxCollectDocs">0</str>

    ...

  </lst>

  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>


<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="classname">solr.IndexBasedSpellChecker</str>
      <str name="spellcheckIndexDir">./spellchecker</str>
      <str name="field">speller</str>
      <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>

If I'm applying a lower case filter to the speller field, why would changing the case of the search term change the spellchecker results? I've looked for solutions but haven't found anything that fixes it.

Thanks for any help.

EDIT: I get the same problem with stopwords: they're not being applied. Even though 'for' is a stopword in stopwords.txt and I'm applying the stop filter to the speller fieldType, if I type 'leave for application' it suggests 'leave form application' as a collation query. Why aren't the stopwords being removed?

Jayce444 · Accepted Answer · 2016-11-07T03:15:56

Alright, I fixed it. I switched the index-based spellchecker in solrconfig.xml to the direct one, and now everything works fine. That is, I changed this:

<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="spellcheckIndexDir">./spellchecker</str>

To this:

<str name="classname">solr.DirectSolrSpellChecker</str>
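
For context, here is a minimal sketch of how the whole searchComponent might look after the change, assuming the `speller` field from the question. The `name`, `distanceMeasure`, `accuracy` and `maxEdits` entries were not in my original config; they are optional tuning settings shown with illustrative values, so adjust or drop them as needed.

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <!-- Assumption: "default" is an illustrative name for this checker -->
    <str name="name">default</str>
    <!-- Reads terms directly from the main index, so no spellcheckIndexDir is needed -->
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="field">speller</str>
    <!-- Optional tuning, illustrative values only -->
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
  </lst>
</searchComponent>
```

Since the direct checker works off the main index, there is no separate spellcheck index to build or keep in sync.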

I'm not sure why the index-based one ignored the filters; I'll have to look into the documentation.
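
One thing that may be worth checking if you need to stay on the index-based checker (an untested guess, not something I verified on this setup): SpellCheckComponent accepts a `queryAnalyzerFieldType` setting that names the field type used to analyze the incoming query for spellchecking. The config in the question doesn't set it, which could be why the lower case and stop filters never ran on the query terms. A sketch, assuming the `speller_type` field type from the question:

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Assumption: analyze spellcheck query terms with the speller field's own type -->
  <str name="queryAnalyzerFieldType">speller_type</str>
  <lst name="spellchecker">
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="field">speller</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```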
