1
votes

I'm using apache-solr-3.4.0. I can search using a single word, but not using more than one word. For example, jobTitle:tester produces results, but jobTitle:java developer doesn't return any.

In my schema.xml I defined the text field type as follows:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
      <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="5"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
      <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="5"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
2
Try surrounding the terms in quotes: jobTitle:"java developer". Otherwise the query is interpreted as jobTitle:java _text_:developer. You can also pass debugQuery=true with your query to see in the results how Solr parses it and confirm it does what you expect. - Hector Correa
I agree with @Hector... I missed that part... - Abhijit Bashetti

2 Answers

1
votes

You have several options, sorted by ease of use:

  1. Use parentheses ( and ) to group the parts of the query that should go to one field, e.g. jobTitle:(java developer). Do not simply put quotes " around them; that runs a phrase query, which is something different.
  2. Define an alternate default field per query using local params, e.g. {!df=jobTitle}java developer. This makes all parts of your query go to that field.
  3. Specify a better default search field per request handler in your solrconfig.xml. This requires a restart after the configuration change.
  4. Use the eDismax or Dismax query parser as the default and define the fields that the search input should be matched against. You can think of this as an extension of option (2) with multiple default fields. It requires a change to your solrconfig.xml, but not a rebuild of your index.
  5. Improve the content of your default field: make it a better catch-all field that contains the content of all fields, or at least of all relevant fields. This requires you to think about your schema design, alter the schema.xml and rebuild your index.
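
Options (3) and (4) could look roughly like the following in solrconfig.xml. This is a minimal sketch, not a drop-in configuration: the handler names /jobsearch-df and /jobsearch are hypothetical, and jobTitle is simply the field from the question. Adjust both to your own setup.

```xml
<!-- Option 3 (sketch): a per-handler default field for the standard query parser.
     The handler name "/jobsearch-df" is a made-up example. -->
<requestHandler name="/jobsearch-df" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Terms without an explicit field prefix are searched in jobTitle -->
    <str name="df">jobTitle</str>
  </lst>
</requestHandler>

<!-- Option 4 (sketch): eDismax with a list of query fields.
     The handler name "/jobsearch" is a made-up example. -->
<requestHandler name="/jobsearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- Space-separated fields the whole user input is matched against;
         add more fields here as needed -->
    <str name="qf">jobTitle</str>
  </lst>
</requestHandler>
```

With a handler like the second one, a request such as q=java developer would be matched against the fields listed in qf rather than split between jobTitle and the default field.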

Background
Imagine that Solr splits your search query into parts at each blank (in reality it is not that simple, but it is good enough for a start). Each part is matched against either the field assigned to it or the default field. From Solr's manual:

The field is only valid for the term that it directly precedes, so the query title:Do it right will find only "Do" in the title field. It will find "it" and "right" in the default field (in this case the text field).
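
To illustrate where the default field comes from, here is a hedged schema.xml fragment in the Solr 3.x style that also sketches option (5). It assumes a catch-all field named text and reuses the text field type from the question; both names are assumptions for illustration, and the fragment shows only the relevant elements, not a complete schema.

```xml
<!-- Inside <fields>: a catch-all field (name "text" is an assumption) -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<!-- At schema level: copy jobTitle into the catch-all field at index time -->
<copyField source="jobTitle" dest="text"/>

<!-- At schema level: unprefixed query terms are searched in this field -->
<defaultSearchField>text</defaultSearchField>
```

With something like this in place, the unprefixed term developer in jobTitle:java developer would be searched in text, which holds a copy of the job title. Changes to copyField only take effect for documents indexed afterwards, so a reindex is needed.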

0
votes

Solr also has an n-gram filter, NGramFilterFactory. Try not to use the n-gram tokenizer. I would suggest using the WhitespaceTokenizer and then applying an n-gram filter:

<filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="3" />

Your field type should look something like this:

<fieldType name="text_custom" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="10" />
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>