The best way to handle this is to create a custom analyzer that uses the edgeNGram token filter. Forget about wildcard queries and using *
in query strings; both underperform the edgeNGram approach, because they do the prefix work at search time rather than once at index time.
You have to create your index like this first and then reindex your data into it:
curl -XPUT http://localhost:9200/sample -d '{
  "settings": {
    "analysis": {
      "filter": {
        "prefixes": {
          "type": "edgeNGram",
          "min_gram": 1,
          "max_gram": 15
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "prefixes"]
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "first_name": {
          "type": "string",
          "analyzer": "my_analyzer",
          "search_analyzer": "standard"
        }
      }
    }
  }
}'
Then when indexing first_name: JUSTIN, you'll get the following indexed tokens: j, ju, jus, just, justi, justin, basically all prefixes of JUSTIN.
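To make the token list concrete, here is a minimal pure-Python sketch of what the lowercase and edgeNGram filters do to a single token. It does not call Elasticsearch; the function name edge_ngrams is made up for illustration, and only min_gram and max_gram come from the settings above.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading substrings of a lowercased token.

    Hypothetical helper that imitates the "lowercase" + "prefixes"
    filter chain from the index settings; it is not an Elasticsearch API.
    """
    token = token.lower()
    # Never emit a gram longer than the token itself or than max_gram.
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


print(edge_ngrams("JUSTIN"))
# ['j', 'ju', 'jus', 'just', 'justi', 'justin']
```

Tokens longer than max_gram get no gram beyond 15 characters, so a query term longer than that would not match in this model.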
You'll then be able to search with your second query and actually find what you expect.
search_response = es.search(index='sample', body={'query': {'match': {'first_name': 'JUST'}}})
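Why this matches, and why the mapping sets search_analyzer to standard, can be sketched without a cluster. The snippet below is a simplified model under stated assumptions: whitespace splitting stands in for the standard tokenizer, a match counts as "at least one shared term", and all function names are hypothetical rather than part of any Elasticsearch client.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    # Leading substrings of the token, as in the "prefixes" filter.
    longest = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, longest + 1)}


def analyze_index(text):
    # Stand-in for my_analyzer: lowercase, split, then edge n-grams.
    terms = set()
    for word in text.lower().split():
        terms |= edge_ngrams(word)
    return terms


def analyze_standard(text):
    # Stand-in for the standard analyzer: lowercase and split only.
    return set(text.lower().split())


def match(doc_text, query, search_analyzer):
    # Simplified match query: the document hits if any query term is indexed.
    return bool(analyze_index(doc_text) & search_analyzer(query))


print(match("JUSTIN", "JUST", analyze_standard))     # True
print(match("JUSTIN", "JUSTICE", analyze_standard))  # False
print(match("JUSTIN", "JUSTICE", analyze_index))     # True
```

The last line shows the effect of analyzing the query with the n-gram analyzer as well: JUSTICE is itself broken into j, ju, ... and so matches JUSTIN through the shared prefixes, which is usually not what you want.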
The match query will not get you records when you search for JUST. You can try with JUST* in a wildcard query.
– Richa