Solr and Haystack autocomplete with spaces

Question

I am trying to get an autocomplete / auto-suggest function working, and have encountered a problem with Haystack (latest master) and Solr (6.6.6).

I am using Haystack's autocomplete() function, which requires the indexed field to be an EdgeNgram (or Ngram) field. The autocomplete queries work fine until I type a space and begin a second word.

For example:

"st" yields ["Star Wars", "Starlight Express"...]
"wa" yields ["Star Wars", "Waterworld"...]
"star" yields ["Star Wars", "Starlight Express"...]

However, as soon as I get to a space and the start of a second word I get no results:

"star w" yields no results

From my investigation so far, this seems to be because Haystack converts the two-word phrase into an AND query over the two words: "star AND w", or (AND: ('title', 'star'), ('title', 'w')). The combination of the AND operator and the second query term "w", which is not a valid stem, means that no results are returned.
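The behaviour described above can be reproduced with a small pure-Python model. This is only a sketch, not Haystack or Solr code: the function names are hypothetical, and the per-word edge n-grams with a minimum gram size of 2 are an assumption about the schema in use.

```python
def edge_ngrams(word, min_size=2, max_size=15):
    """Front-anchored n-grams of one word, e.g. 'wars' -> {'wa', 'war', 'wars'}.

    min_size=2 and max_size=15 are assumed values, not read from a real schema.
    """
    word = word.lower()
    return {word[:n] for n in range(min_size, min(len(word), max_size) + 1)}


def index_title(title):
    """Index every whitespace-separated word on its own (assumed per-word EdgeNgram field)."""
    grams = set()
    for word in title.split():
        grams |= edge_ngrams(word)
    return grams


def autocomplete_matches(query, title):
    """AND semantics: every word typed so far must be one of the indexed grams."""
    indexed = index_title(title)
    return all(term.lower() in indexed for term in query.split())


print(autocomplete_matches("star", "Star Wars"))    # True
print(autocomplete_matches("wa", "Star Wars"))      # True
print(autocomplete_matches("star w", "Star Wars"))  # False: 'w' is shorter than min_size
```

Under these assumptions, the single letter "w" never appears in the index, so the AND of both terms cannot match until a second letter is typed.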

I could override Haystack's autocomplete to use the OR operator to partially fix this...

But, is there a better approach / solution?

Ideally I would like the search for "star w" to return "star wars" (and not all films starting with W, which the OR operator could cause).

Other search functionality seems to be working fine, so it is not a general configuration problem - but seems to be specific to the nature of the autocomplete query / use case.

How can I configure Solr / use Haystack to get the desired autocomplete responses that span two words with a space?

Can you try setting the field type for your field to string, or use the KeywordTokenizer along with the lowercase filter? - Abhijit Bashetti

Abhijit Bashetti · Accepted Answer · 2020-08-14T12:31:19

You can use the field type below for your suggest field. You can then use wildcards when querying: star w*

<fieldType name="suggestionFieldType" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer does no actual tokenizing, so the entire
         input string is preserved as a single token
      -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- The LowerCase TokenFilter does what you expect, which can be
         useful when you want your matching and sorting to be case insensitive
      -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
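As a rough illustration of why a trailing wildcard works with this field type, the sketch below models the analyzer in plain Python. It is not Solr code; the function names and the sample titles are made up for the example.

```python
def keyword_token(value):
    """Model KeywordTokenizer + LowerCaseFilter: the whole value is one lowercase token."""
    return value.lower()


def prefix_query(prefix, titles):
    """Model a trailing-wildcard query such as `star w*` against the single-token field."""
    # The prefix is lowercased here so it is compared against the lowercased token.
    needle = prefix.lower()
    return [title for title in titles if keyword_token(title).startswith(needle)]


titles = ["Star Wars", "Starlight Express", "Waterworld"]
print(prefix_query("star w", titles))  # ['Star Wars']
print(prefix_query("st", titles))      # ['Star Wars', 'Starlight Express']
```

Because the title is kept as one token, the space is just another character in the prefix. In a real Solr query the space would probably need to be escaped (for example `star\ w*`) so that the query parser does not split the input into two terms.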

Alternatively, you can try the field type below for your field. In this case, the acceptable partial search phrases would be:

s
st
sta
star
star w
star wa

and so on...

<fieldType name="suggestion_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase before n-gramming so that "star w" matches "Star Wars" -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
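The effect of indexing edge n-grams of the whole title can be sketched in plain Python. This is a simplified model rather than Solr's implementation: the function names and sample titles are hypothetical, and `lowercase=True` models a LowerCaseFilterFactory in both analyzers.

```python
def index_grams(title, max_gram=100, lowercase=True):
    """Model the index analyzer: one token for the whole title, expanded into front edge n-grams."""
    token = title.lower() if lowercase else title
    return {token[:n] for n in range(1, min(len(token), max_gram) + 1)}


def query_token(text, lowercase=True):
    """Model the query analyzer: the whole input stays a single token, with no n-grams."""
    return text.lower() if lowercase else text


def suggest(text, titles):
    """Return the titles whose indexed grams contain the query token exactly."""
    return [title for title in titles if query_token(text) in index_grams(title)]


titles = ["Star Wars", "Starlight Express", "Waterworld"]
print(suggest("star w", titles))  # ['Star Wars']
print(suggest("wa", titles))      # ['Waterworld']
```

Note the trade-off this model exposes: grams are anchored at the start of the whole title, so "wa" no longer returns "Star Wars". If matching on the start of any word is also wanted, a per-word n-gram field would still be needed alongside this one.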
