
This is my field type declared in schema:

<fieldType name="c_string" class="solr.TextField">
 <analyzer type="index">
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ReversedWildcardFilterFactory"/>
 </analyzer>
 <analyzer type="query">
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

I can search using wildcards without any problems, but I have a problem with the highlight feature: Solr highlights the entire field value, not only the matched phrase. For example, my search query is title:Keyword*, so Solr returns only documents matching the wildcard. But the highlight is:

"title": [
        "<em>Keyword and the rest of title</em>"
]

but I want:

"title": [
        "<em>Keyword</em> and the rest of title"
]

This works as I want if I use solr.EdgeNGramFilterFactory like this:

<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
      <tokenizer class="solr.LowerCaseTokenizerFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
   </analyzer>
   <analyzer type="query">
      <tokenizer class="solr.LowerCaseTokenizerFactory"/>
   </analyzer>
</fieldType>

With this field type the highlight is OK, but wildcards are ignored. Solr always searches as if a wildcard were present: title:Keyword and title:Keyword* return the same results, whereas title:Keyword obviously should not match anything.

Do you have any tips?

[added] Example query:


Example highlight result:

      "text_dsc":["<em>14276|\nDzień dobry -  dokument testowy. \n\n \n\nTEST. \n\n\n</em>"]},
      "text_dsc":["<em>14276|\nDzień dobry -  dokument testowy. \n\n \n\nTEST. \n\n\n</em>"]},
      "text_dsc":["<em>14276|\nDzień dobry -  dokument testowy. \n\n \n\nTEST. \n\n\n</em>"]}}}

As you can see, the query string is dobry, but the entire field is highlighted. Why? If I use solr.EdgeNGramFilterFactory as mentioned above, the highlight for the same query is correct, but searching is incorrect (it always behaves like a wildcard search).

Can you please post an example query, especially the highlighting parameters? - lxg
Question updated. The query is generated by the Solr web admin interface. - user1209216

1 Answer


Use StandardTokenizerFactory and you will get the desired output:

<fieldType name="c_string" class="solr.TextField">
 <analyzer type="index">
  <!-- StandardTokenizer splits the value into word tokens, so single words can be highlighted -->
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ReversedWildcardFilterFactory"/>
 </analyzer>
 <analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

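KeywordTokenizerFactory treats the whole field value as a single token, so the highlighter can only mark that whole token, while StandardTokenizerFactory splits the value into words. The difference between StandardTokenizerFactory and KeywordTokenizerFactory is explained very well in this question: difference between StandardTokenizerFactory and KeywordTokenizerFactory in SoLR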


Alternatively, index text_dsc in two different fields, with field types like these:

<fieldType name="text_dsc" class="solr.TextField">
 <analyzer type="index">
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ReversedWildcardFilterFactory"/>
 </analyzer>
 <analyzer type="query">
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

<fieldType name="text_dsc_standard" class="solr.TextField">
 <analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ReversedWildcardFilterFactory"/>
 </analyzer>
 <analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

And in your search query set hl.fl=text_dsc_standard.
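
A minimal sketch of how the two fields could be wired up in the schema, assuming both field types above are defined under the names shown. The field names here simply reuse the type names, and the stored="true" and copyField settings are my assumptions (the highlighter needs the stored text to build snippets), not something taken from the original schema:

```xml
<!-- Assumed field declarations: names and attributes are illustrative, adjust to your schema -->

<!-- Searched field: whole value indexed as one token (KeywordTokenizer) -->
<field name="text_dsc" type="text_dsc" indexed="true" stored="true"/>

<!-- Highlighted field: same content, tokenized into words (StandardTokenizer) -->
<field name="text_dsc_standard" type="text_dsc_standard" indexed="true" stored="true"/>

<!-- Copy the raw input of text_dsc into the word-tokenized field at index time -->
<copyField source="text_dsc" dest="text_dsc_standard"/>
```

With something like this in place, the search can still run against text_dsc, while hl=true together with hl.fl=text_dsc_standard asks the highlighter to build snippets from the word-tokenized copy. This relies on hl.requireFieldMatch staying at its default of false, and the documents need to be reindexed after the schema change.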