There are two fields in my schema:
field1 uses the keyword tokenizer, which preserves the input as a single token (it does not even split on spaces; I double-checked that in the Analysis tab).
field2 uses WhitespaceTokenizerFactory, which splits the text on spaces, tabs, etc.
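To make the behaviour I am describing concrete, here is a minimal sketch in plain Python. It is not Solr code: the function names keyword_tokenize and whitespace_tokenize are made up for illustration and only mimic what I see for each tokenizer in the Analysis tab.

```python
def keyword_tokenize(text):
    # Hypothetical stand-in for a keyword tokenizer:
    # the whole input is kept as one token, spaces included.
    return [text]


def whitespace_tokenize(text):
    # Hypothetical stand-in for a whitespace tokenizer:
    # split on any run of spaces, tabs, or newlines.
    return text.split()


if __name__ == "__main__":
    print(keyword_tokenize("hello world"))     # one token
    print(whitespace_tokenize("hello world"))  # two tokens
```

So for the input hello world I would expect one token on field1 and two tokens on field2.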
<field name="field1" type="field1_type" indexed="true" stored="false"/>
<field name="field2" type="field2_type" indexed="true" stored="false"/>
<fieldType name="field2_type" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
I am using the edismax parser with the default qf value = field1 field2
Now when I query with q=hello world, debug mode shows that it builds a query like this:
rawquerystring:hello world
querystring:hello world
parsedquery:(+((DisjunctionMaxQuery((field1:hello | field2:hello)) DisjunctionMaxQuery((field1:world | field2:world)))~1) ())/no_coord
parsedquery_toString:+(((field1:hello | field2:hello) (field1:world | field2:world))~1) ()
What I expected was something like this:
expected:+(((field1:hello world) ((field2:hello) (field2:world))~1) ()
i.e. for field1 it should not split the query on spaces, since that field uses the keyword tokenizer, while for field2 it should split the query on spaces.
Can you please tell me what I am doing wrong?