In our Solr implementation, we use grouping (field collapsing) to make sure that type-ahead results are unique. Some of our content has the same display term but different codes behind it (multiple fields are involved).

For the most part, this works fine. The field we group on is a standard StrField. Where it falls apart is when the same display term appears in different cases (e.g., "solr" vs. "SOLR").

How can I make grouping case insensitive? The other catch is that we don't want to tokenize the string into multiple words. For example:

The terms are "SOLR rocks", "solr rocks", and "SOLR is awesome". A search for SOLR should return two results: "SOLR rocks" and "SOLR is awesome". If the field were tokenized into words, all three would be grouped together.

Thanks

1 Answer

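Use a non-tokenized, lowercased field for grouping. A StrField cannot take an analyzer, so define a TextField whose analyzer uses KeywordTokenizerFactory (which keeps the whole value as a single token) followed by LowerCaseFilterFactory. "SOLR rocks" and "solr rocks" then index to the same term, while "SOLR is awesome" stays separate, which makes the grouping case-insensitive.

For example, a field type configuration: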

<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
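
To put that field type to use, one option is to keep the original field for display and copy it into a second field that exists only for grouping. The following is a minimal sketch, not a drop-in config: the field names `display_term` and `display_term_group` are hypothetical, and it assumes the display field is single-valued.

```xml
<!-- Hypothetical field names; adjust to match your schema. -->

<!-- Original display value, kept as-is so results show the original casing. -->
<field name="display_term" type="string" indexed="true" stored="true"/>

<!-- Grouping key: uses the "lowercase" field type defined above.
     It only needs to be indexed, not stored. -->
<field name="display_term_group" type="lowercase" indexed="true" stored="false"/>

<!-- copyField copies the raw source value; the lowercase analyzer
     is applied when the destination field is indexed. -->
<copyField source="display_term" dest="display_term_group"/>
```

After reindexing, grouping on the new field with request parameters such as `group=true&group.field=display_term_group` should collapse "SOLR rocks" and "solr rocks" into one group. Expect the group value in the response to be the lowercased term; read the stored `display_term` from the documents in each group if you need the original casing.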