I'm currently running Haystack with an Elasticsearch backend, and I'm building an autocomplete for city names. The problem is that SearchQuerySet gives me different results than the same query executed directly in Elasticsearch: the Haystack results look wrong to me, while the Elasticsearch results are what I expect.
I'm using: Django 1.5.4, django-haystack 2.1.0, pyelasticsearch 0.6.1, elasticsearch 0.90.3
Using the following example data:
- Midfield
- Midland City
- Midway
- Minor
- Minturn
- Miami Beach
Using either
SearchQuerySet().models(Geoname).filter(name_auto='mid')
or
SearchQuerySet().models(Geoname).autocomplete(name_auto='mid')
The result always contains all 6 names, including Min* and Mia*. However, querying Elasticsearch directly returns the right data:
"query": {
    "filtered": {
        "query": {
            "match_all": {}
        },
        "filter": {
            "term": {"name_auto": "mid"}
        }
    }
}
{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
            {
                "_index": "haystack",
                "_type": "modelresult",
                "_id": "csi.geoname.4075977",
                "_score": 1,
                "_source": {
                    "name_auto": "Midfield"
                }
            },
            {
                "_index": "haystack",
                "_type": "modelresult",
                "_id": "csi.geoname.4075984",
                "_score": 1,
                "_source": {
                    "name_auto": "Midland City"
                }
            },
            {
                "_index": "haystack",
                "_type": "modelresult",
                "_id": "csi.geoname.4075989",
                "_score": 1,
                "_source": {
                    "name_auto": "Midway"
                }
            }
        ]
    }
}
The behavior is the same with other examples. My guess is that, through Haystack, the query string is being split and analyzed into all possible "min_gram" groups of characters, and that is why it returns the wrong results.
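To make that guess concrete, here is a minimal pure-Python sketch of what I think is happening. It does not call Haystack or Elasticsearch at all: `edge_ngrams`, `search_analyzed` and `search_exact_term` are hypothetical helpers, and `min_gram=2` / `max_gram=15` are my assumption about the analyzer settings, not something I have verified.

```python
def edge_ngrams(text, min_gram=2, max_gram=15):
    """Return the front edge n-grams of every lowercased word in text.

    min_gram/max_gram are assumed values, not read from the real analyzer.
    """
    grams = set()
    for word in text.lower().split():
        for size in range(min_gram, min(len(word), max_gram) + 1):
            grams.add(word[:size])
    return grams


CITIES = ["Midfield", "Midland City", "Midway", "Minor", "Minturn", "Miami Beach"]

# Terms that would be stored in the index for each city name.
INDEX = {city: edge_ngrams(city) for city in CITIES}


def search_analyzed(query):
    """Analyze the query with the same edge n-grams; match if ANY gram is indexed."""
    query_terms = edge_ngrams(query)
    return [city for city in CITIES if INDEX[city] & query_terms]


def search_exact_term(query):
    """Mimic the term filter: the query is used as one term, without analysis."""
    return [city for city in CITIES if query in INDEX[city]]


if __name__ == "__main__":
    print(edge_ngrams("mid"))
    print(search_analyzed("mid"))
    print(search_exact_term("mid"))
```

Under these assumptions, "mid" is analyzed into the grams "mi" and "mid". Every one of the six names has "mi" among its indexed terms, so the analyzed search returns all of them, while the exact term lookup returns only Midfield, Midland City and Midway, which matches what I see from Elasticsearch directly.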
I'm not sure whether I'm doing or understanding something wrong, or whether this is how Haystack is supposed to work, but I need the Haystack results to match the Elasticsearch results.
So, how can I fix this issue or make it work?
My summarized objects look as follows:
Model:
from django.db import models

class Geoname(models.Model):
    id = models.IntegerField(primary_key=True)
    name = models.CharField(max_length=255)
Index:
from haystack import indexes

class GeonameIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    # Edge n-gram field used for autocomplete on the city name
    name_auto = indexes.EdgeNgramField(model_attr='name')

    def get_model(self):
        return Geoname
Mapping:
modelresult: {
    _boost: {
        name: "boost",
        null_value: 1
    },
    properties: {
        django_ct: {
            type: "string"
        },
        django_id: {
            type: "string"
        },
        name_auto: {
            type: "string",
            store: true,
            term_vector: "with_positions_offsets",
            analyzer: "edgengram_analyzer"
        }
    }
}
Thank you.