From the Elasticsearch documentation:
The match_phrase query first analyzes the query string to produce a list of terms. It then searches for all the terms, but keeps only documents that contain all of the search terms, in the same positions relative to each other.
I have configured my analyzer to use an edge_ngram filter with the keyword tokenizer:
{
  "index": {
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  }
}
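To make the analyzer's output concrete, here is a minimal plain-Java sketch of what I assume this chain produces: the keyword tokenizer keeps the whole input as one token, lowercase normalizes it, and edge_ngram emits every prefix from min_gram to max_gram characters. The class EdgeNgramSketch and its analyze method are hypothetical helpers, not part of Elasticsearch, and the sketch ignores token positions and offsets entirely. The real token stream can be inspected with the _analyze API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EdgeNgramSketch {

    // Hypothetical emulation of: keyword tokenizer -> lowercase -> edge_ngram.
    // The keyword tokenizer treats the whole input as a single token, so the
    // grams are simply the prefixes of the lowercased input.
    public static List<String> analyze(String input, int minGram, int maxGram) {
        String token = input.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prefixes of the indexed value, including ones that contain the space.
        System.out.println(analyze("hello world", 1, 20));
        // Prefixes of the query string: prints [h, ho]
        System.out.println(analyze("ho", 1, 20));
    }
}
```

Under these assumptions the query string "ho" yields exactly two terms, "h" and "ho", because the same analyzer is applied at search time.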
Here is the Java class used for indexing:
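import lombok.Getter;
import lombok.Setter;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
import org.springframework.data.elasticsearch.annotations.Setting;

@Document(indexName = "myindex", type = "program")
@Getter
@Setter
@Setting(settingPath = "/elasticsearch/settings.json")
public class Program {
    @org.springframework.data.annotation.Id
    private Long instanceId;
    // The same edge_ngram analyzer is applied at index time and at search time.
    @Field(analyzer = "autocomplete", searchAnalyzer = "autocomplete", type = FieldType.String)
    private String name;
}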
If a document contains the phrase "hello world" in the name field, the following query matches it:
{
  "match": {
    "name": {
      "query": "ho",
      "type": "phrase"
    }
  }
}
Result: "hello world"
That is not what I expect, because not all of the search terms are in the document.
My questions:
1- Shouldn't the edge_ngram/autocomplete analyzer produce 2 search terms for the query "ho"? (The terms should be "h" and "ho", respectively.)
2- Why does "ho" match "hello world" when, according to the definition of the phrase query, not all of the terms matched? (The "ho" term shouldn't have matched.)
Update:
In case the question is not clear: the match phrase query should analyze the query string into a list of terms; here the string is ho. That gives 2 terms, since this is edge_ngram with min_gram 1. The 2 terms are h and ho. According to Elasticsearch, the document must contain all of the search terms. However, hello world has h only and doesn't have ho, so why did I get a match here?
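The reasoning in the update can be checked with a small plain-Java sketch. It emulates the analyzer (keyword tokenizer, lowercase, edge_ngram with min_gram 1 and max_gram 20) and reports, for each query term, whether the same term exists among the indexed terms of the document. TermOverlapSketch and its methods are hypothetical helpers, not Elasticsearch APIs, and the sketch only tests term membership; it does not model the token positions that a phrase query also takes into account.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class TermOverlapSketch {

    // Hypothetical emulation of: keyword tokenizer -> lowercase -> edge_ngram.
    public static Set<String> analyze(String input, int minGram, int maxGram) {
        String token = input.toLowerCase(Locale.ROOT);
        Set<String> grams = new LinkedHashSet<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // For each term produced from the query, records whether the document's
    // analyzed terms contain it. Positions are deliberately ignored.
    public static Map<String, Boolean> queryTermsInDocument(String query, String document) {
        Set<String> docTerms = analyze(document, 1, 20);
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String term : analyze(query, 1, 20)) {
            result.put(term, docTerms.contains(term));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {h=true, ho=false}
        System.out.println(queryTermsInDocument("ho", "hello world"));
    }
}
```

So on term membership alone, only "h" is present in the document and "ho" is not, which is exactly why the match is surprising if every query term is required.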
name field. (2) You haven't specified any sample doc, so we don't know what data you are trying to match against. Clarify these points so that people here can help you better. – Nishant