I have a text field in my web app that allows users to find other people by name. You start typing in the box and the server sends back possible matches as you type. I set up Haystack with a Solr backend and a simple search index:

class UserIndex(SearchIndex):
    text = NgramField(document=True, model_attr='get_full_name')

site.register(BerlinrUser, UserIndex)

Then I run manage.py build_solr_schema, copy the schema to my solr/conf directory, restart Solr, and finally run manage.py update_index.

In my Django view, I have the following search code:

q = request.GET.get("q")
search_result = SearchQuerySet().models(User).autocomplete(content=q)
for result in search_result:
    # Collect user IDs, pull from database, and send back a JSON result

The problem is that the autocomplete does not return what I want. Given this collection of users:

John Smith
John Jingleheimer-Schmidt
Jenny Smith

Here is the behavior that I want:

Query:        Expected result:
"John"        "John Smith",                     
              "John Jingleheimer-Schmidt"
"Smith"       "John Smith",
              "Jenny Smith"
"Smi"         "John Smith",
              "Jenny Smith"
"J"           <Anybody with a first or last name that begins with "J">
"John Sm"     "John Smith"

Note, it is acceptable for the query "ohn Smi" to not return anything, as opposed to matching "John Smith".

However, using Haystack/Solr, "Smi", "J", and "John Sm" return no results at all. To get Haystack/Solr to return anything, I have to use whole words. According to the Haystack documentation, I should use the NgramField to match across word boundaries, but it doesn't appear to be doing that. Any ideas?

1 Answer

Found out why the query wasn't working as expected.

My first problem was in the index definition. The n-gram field should be a separate field rather than the document field, like this:

class UserIndex(SearchIndex):
    text = CharField(document=True)
    name_auto = NgramField(model_attr='get_full_name')

Another problem was in my Solr schema.xml file. The minGramSize was set to "3", so no n-grams shorter than three characters were indexed and queries under 3 characters (such as "J") could not match. Setting it to "1", restarting Solr, and rebuilding the index fixed the problem.
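
To see why the minimum gram size matters, here is a rough plain-Python sketch of n-gram matching. It is only an illustration, not Solr's actual analyzer: the helpers ngrams and matches are hypothetical names, the default max_gram of 15 is an assumption, and query terms are looked up literally, whereas a real schema may also analyze the query.

```python
def ngrams(token, min_gram, max_gram):
    """Return every substring of token with length min_gram..max_gram."""
    token = token.lower()
    grams = set()
    for size in range(min_gram, min(max_gram, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def matches(query, name, min_gram, max_gram=15):
    """True if every whitespace-separated query term is an indexed gram of name."""
    indexed = set()
    for word in name.split():
        indexed |= ngrams(word, min_gram, max_gram)
    return all(term.lower() in indexed for term in query.split())


# With a minimum gram size of 3, nothing shorter than 3 characters is indexed.
print(matches("J", "John Smith", min_gram=3))
print(matches("John Sm", "John Smith", min_gram=3))

# With a minimum gram size of 1, short prefixes are indexed as well.
print(matches("J", "John Smith", min_gram=1))
print(matches("John Sm", "John Smith", min_gram=1))
```

Note that plain n-grams also index substrings from the middle of a word, so in this sketch "ohn Smi" matches "John Smith" as well, which the question says is acceptable either way.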