This is my field type, as declared in the schema:
<fieldType name="c_string" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.ASCIIFoldingFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.ReversedWildcardFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.ASCIIFoldingFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
I can search using wildcards without any problems, but I have a problem with the highlighting feature: Solr highlights the entire field value, not only the matched phrase. For example, my search query is title:Keyword*, so Solr returns only documents matching the wildcard. But the highlight is:
"title": [
"<em>Keyword and the rest of title</em>"
but I want:
"title": [
"<em>Keyword</em> and the rest of title"
This works as I want if I use solr.EdgeNGramFilterFactory like this:
<fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.LowerCaseTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.LowerCaseTokenizerFactory"/>
</analyzer>
</fieldType>
If I use it, the highlighting is OK, but wildcards are ignored: Solr always searches as if a wildcard were present, so title:Keyword and title:Keyword* behave the same. Obviously, title:Keyword should not match anything.
Do you have any tips?
[added] Example query:
select?q=text_dsc%3A*dobry*&rows=200&wt=json&indent=true&hl=true&hl.fl=text_dsc&hl.simple.pre=<em>&hl.simple.post=<%2Fem>
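For anyone who wants to reproduce this, the request above can be rebuilt with properly encoded parameters. This is only a sketch: the host, port, and core name in `BASE_URL` are hypothetical (the query above starts at `select?`), and `build_query_url` is an illustrative helper, not part of any Solr client. Note that `urlencode` also percent-encodes `*` as `%2A`, which is equivalent to the literal asterisk.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the post only shows the part starting at "select?".
BASE_URL = "http://localhost:8983/solr/collection1/select"

# Parameters taken from the example query above.
PARAMS = {
    "q": "text_dsc:*dobry*",
    "rows": 200,
    "wt": "json",
    "indent": "true",
    "hl": "true",
    "hl.fl": "text_dsc",
    "hl.simple.pre": "<em>",
    "hl.simple.post": "</em>",
}


def build_query_url(base_url=BASE_URL, query_params=PARAMS):
    """Return the full request URL with every parameter percent-encoded."""
    return base_url + "?" + urlencode(query_params)


print(build_query_url())
```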
Example highlight result:
"highlighting":{
"25352":{
"text_dsc":["<em>14276|\nDzień dobry - dokument testowy. \n\n \n\nTEST. \n\n\n</em>"]},
"25353":{
"text_dsc":["<em>14276|\nDzień dobry - dokument testowy. \n\n \n\nTEST. \n\n\n</em>"]},
"26693":{
"text_dsc":["<em>14276|\nDzień dobry - dokument testowy. \n\n \n\nTEST. \n\n\n</em>"]}}}
As you can see, the query string is dobry, but the entire field is highlighted. Why? If I use solr.EdgeNGramFilterFactory as mentioned above, the highlight is correct for the same query, but the search is incorrect (it always behaves like a wildcard search).
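To frame the question, here is a toy model of the two behaviours described above. It is not Solr code. It assumes that a highlighter wraps whole index tokens, that a keyword-style tokenizer emits the entire field value as one token, and that a word-splitting tokenizer emits one token per word. The helpers `keyword_tokens`, `word_tokens`, and `highlight` are hypothetical names used only for illustration.

```python
import re
from fnmatch import fnmatchcase


def keyword_tokens(text):
    """Mimic a keyword-style tokenizer: the whole value is a single token."""
    return [(0, len(text), text.lower())]


def word_tokens(text):
    """Mimic a word-splitting tokenizer: one lowercased token per word."""
    return [(m.start(), m.end(), m.group().lower())
            for m in re.finditer(r"\w+", text)]


def highlight(text, tokens, pattern, pre="<em>", post="</em>"):
    """Wrap every token whose lowercased term matches the wildcard pattern."""
    out, pos = [], 0
    for start, end, term in tokens:
        if fnmatchcase(term, pattern.lower()):
            out.append(text[pos:start] + pre + text[start:end] + post)
            pos = end
    out.append(text[pos:])  # untouched remainder after the last match
    return "".join(out)


title = "Keyword and the rest of title"
print(highlight(title, keyword_tokens(title), "Keyword*"))
print(highlight(title, word_tokens(title), "Keyword*"))
```

Under these assumptions, the single-token variant wraps the whole title, while the per-word variant wraps only the first word, which mirrors the two highlight results shown earlier.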