7
votes

I have followed an example from here

The mapping for the index is

{
  "mappings": {
    "my_type": {
      "properties": {
        "full_text": {
          "type":  "string" 
        },
        "exact_value": {
          "type":  "string",
          "index": "not_analyzed" 
        }
      }
    }
  }
}

And the document indexed is

{
  "full_text":   "Quick Foxes!", 
  "exact_value": "Quick Foxes!"  
}

I ran a simple match query on the "full_text" field, like below:

{
  "query": {
    "match": {
      "full_text": "quick"
    }
  }
}

I see that the document matches. It also matches if I use uppercase, that is "QUICK", as the search term.

Why is that? By default, the tokenizer would have split the text in the "full_text" field into "quick" and "foxes". So how does the match query match the document for uppercase values?

1
I don't get your problem. Do you mean that when you search for "quick" or "QUICK" you always get the same documents? - CodeNotFound
Yes. With the "match" query, both words match. But with the "term" query, the uppercase word doesn't find a match, which is understandable because a term query looks up the analyzed terms directly. Is the query string "QUICK" getting converted to "quick" when using the match query? - Arun Mohan
I answered your question below. Please accept it as an answer if it satisfies you ;-) - CodeNotFound
@CodeNotFound Sorry, I didn't notice the answer :) - Arun Mohan

1 Answer

8
votes

Because you haven't specified an analyzer for the "full_text" field in your index mapping, the default analyzer is used. The default is the "Standard Analyzer".

Quote from ElasticSearch docs:

An analyzer of type standard is built using the Standard Tokenizer with the Standard Token Filter, Lower Case Token Filter, and Stop Token Filter.

Before executing a match query against your index, ElasticSearch applies the analyzer configured for the field to your query text as well. Because the default analyzer includes the Lower Case Token Filter, searching for "Quick", "QUICK" or "quick" produces the same query: the analyzer lowercases each of them to just "quick", which is the term stored in the index. A term query, by contrast, is not analyzed, so "QUICK" is looked up as-is and finds nothing.
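
To make the difference concrete, here is a rough Python sketch of the idea. It is only a toy model, not how ElasticSearch is implemented: the `standard_analyze`, `match_query` and `term_query` functions are hypothetical names made up for this illustration, and the "analyzer" is assumed to be nothing more than a split on non-word characters followed by lowercasing.

```python
import re


def standard_analyze(text):
    """Toy stand-in for the standard analyzer (an assumption, not the real one):
    split on non-word characters, then lowercase every token."""
    return [token.lower() for token in re.findall(r"\w+", text)]


# Index time: the analyzed "full_text" field stores the lowercased tokens.
indexed_terms = set(standard_analyze("Quick Foxes!"))


def match_query(query_text):
    """Analyze the query text first, then look for any resulting term in the index."""
    return any(term in indexed_terms for term in standard_analyze(query_text))


def term_query(term):
    """Look up the term exactly as given, with no analysis."""
    return term in indexed_terms


print(match_query("QUICK"))  # True: "QUICK" is lowercased to "quick" before the lookup
print(term_query("QUICK"))   # False: the index only contains "quick"
print(term_query("quick"))   # True
```

If you want to see the real tokens, the _analyze API will show what the standard analyzer produces for a given piece of text.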