Newbie question on Elasticsearch. I have set up an Elasticsearch (Lucene) index and am searching for names that contain some term, for example:

search_response = es.search(index = 'sample', body = {'query':{'match':{'first_name':"JUST"}}})

This does not return the name "JUSTIN", but the following query does:

search_response = es.search(index = 'sample', body = {'query':{'match':{'first_name':"JUSTIN"}}})

What am I doing wrong? Shouldn't a "match" query return the records that contain the term? Thanks.

Possible duplicate of "Elasticsearch: Find substring match". (BlackPOP)
No, a match query will not return records when you search for JUST. You can try JUST* in a wildcard query. (Richa)

1 Answer

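A match query compares whole analyzed tokens: with the default analyzer, JUSTIN is indexed as the single token justin, so a search for JUST (analyzed to just) finds nothing. The best way to handle this is to create a custom analyzer that uses the edgeNGram token filter. Forget about wildcards and using * in query strings; they generally perform worse than the edgeNGram approach.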

So you first have to create your index like this and then reindex your data into it:

curl -XPUT http://localhost:9200/sample -d '{
    "settings": {
        "analysis": {
            "filter": {
                "prefixes": {
                    "type": "edgeNGram",
                    "min_gram": 1,
                    "max_gram": 15
                }
            },
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "prefixes"]
                }
            }
        }
    },
    "mappings": {
        "your_type": {
            "properties": {
                "first_name": {
                    "type": "string",
                    "analyzer": "my_analyzer",
                    "search_analyzer": "standard"
                }
            }
        }
    }
}'
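
If you would rather stay in Python, roughly the same index can be created through the client. This is only a sketch: it assumes `es` is the same elasticsearch-py client object used in the question and that the installed client version still accepts a `body` argument. `INDEX_BODY` and `create_sample_index` are names made up for illustration.

```python
# Same settings and mappings as the curl command above, written as a Python dict.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                "prefixes": {"type": "edgeNGram", "min_gram": 1, "max_gram": 15}
            },
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "prefixes"],
                }
            },
        }
    },
    "mappings": {
        "your_type": {
            "properties": {
                "first_name": {
                    "type": "string",
                    "analyzer": "my_analyzer",
                    "search_analyzer": "standard",
                }
            }
        }
    },
}


def create_sample_index(es, name="sample"):
    """Create the index; `es` is assumed to be an elasticsearch.Elasticsearch client."""
    return es.indices.create(index=name, body=INDEX_BODY)
```

Calling `create_sample_index(es)` will fail if an index named sample already exists, so delete or rename the old index before reindexing.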

Then, when you index first_name: JUSTIN, the following tokens are indexed: j, ju, jus, just, justi, justin, that is, all prefixes of JUSTIN in lowercase.
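
To make the token list concrete, here is a small pure-Python imitation of what the analyzer does. It does not call Elasticsearch; `edge_ngrams` and `index_tokens` are hypothetical helper names, and the regular expression is only a crude approximation of the standard tokenizer.

```python
import re


def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_tokens(text):
    """Rough stand-in for my_analyzer: split into words, lowercase, emit prefixes."""
    words = re.findall(r"\w+", text)  # crude approximation of the standard tokenizer
    tokens = []
    for word in words:
        tokens.extend(edge_ngrams(word.lower()))
    return tokens


print(index_tokens("JUSTIN"))
# ['j', 'ju', 'jus', 'just', 'justi', 'justin']
```

Note that with max_gram set to 15, names longer than 15 characters only get prefixes up to 15 characters.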

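You'll then be able to search with your first query and actually find what you expect. Because the search_analyzer is standard, the query text JUST is only lowercased to just (it is not split into prefixes), and that matches the indexed token just: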

search_response = es.search(index = 'sample', body = {'query':{'match':{'first_name':'JUST'}}})
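
The difference between the old and the new index can be illustrated without a cluster. The sketch below is a simplified model, not how Elasticsearch is implemented: it treats a match as "some analyzed query token equals some indexed token", and `prefixes` and `match` are hypothetical names.

```python
def prefixes(token, min_gram=1, max_gram=15):
    """Set of edge n-grams (prefixes) of `token`."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def match(query, indexed_tokens):
    """Hit when an analyzed query token equals an indexed token."""
    # Lowercasing whitespace-separated words stands in for the standard search analyzer.
    query_tokens = {word.lower() for word in query.split()}
    return bool(query_tokens & indexed_tokens)


standard_index = {"justin"}            # what the default analyzer stores for JUSTIN
edge_ngram_index = prefixes("justin")  # what my_analyzer stores for JUSTIN

print(match("JUST", standard_index))    # False
print(match("JUST", edge_ngram_index))  # True
print(match("JUSTIN", standard_index))  # True
```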