I have what I believe is a common use case among Elasticsearch users. Here is my template:

PUT _template/test 
{
    "template" : "test*",
    "settings" : {
        "number_of_shards" : 5,
        "number_of_replicas" : 1
    },
    "mappings" : {
        "test": {
            "properties": {
                "name": {
                    "type": "string",
                    "index": "analyzed"
                },
                "description": {
                    "type": "string",
                    "index": "analyzed",
                    "analyzer": "english",
                    "fields": {
                        "raw": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                }
            }
        }
    }
}

Now I'm going to put a single document in the index:

POST /test/test
{
    "name":"test-1",
    "description":"on the first day of christmas my true love gave to me a partridge in a pear tree"
}

Now imagine I have a million of these records. If I search for "on the" on the description field, I would like nothing to come back, because those are common words that the english analyzer should remove. However, if I search for the exact text "on the", I would like the documents that contain that exact text to be returned.

My question to the Elasticsearch community: how do I allow for this, and what should the query look like? I added the .raw sub-field to description, but no matter what my query string is, I can't get the exact text to return any results.

1
You have to query it like GET /test/test/_search { "query": { "term": { "description.raw": "on the" } }} - Richa
When I run that exact query I get { "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 0, "max_score": null, "hits": [] } } - Casey Johnson
That may be because you don't have any description with that exact text. - Richa
But I only have one document added, which starts with the text "on the first day...", so shouldn't it match "on the"? I tried to make this example as basic as possible. If I change the query to "on the first" then it returns a result, because "first" won't be removed by the english analyzer, but I should be able to search description.raw with "on the" and get a result. - Casey Johnson
@Richa Ahh, I see your point. It returns nothing because the query is only part of the exact text, not the whole of it. - Casey Johnson
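
As a rough illustration of the point reached in the comments above: a not_analyzed field is indexed as one single term (the whole value), and a term query looks up its input as-is, so it only matches when given the entire text. The snippet below is a toy Python model of that behaviour, not Elasticsearch code; the function names are made up for this sketch.

```python
# Toy model of a not_analyzed field and a term query (not real Elasticsearch code).
DESCRIPTION = ("on the first day of christmas my true love gave to me "
               "a partridge in a pear tree")


def index_not_analyzed(value):
    """A not_analyzed string is indexed as exactly one term: the whole value."""
    return {value}


def term_query(indexed_terms, term):
    """A term query looks the term up exactly as given, with no analysis."""
    return term in indexed_terms


raw_terms = index_not_analyzed(DESCRIPTION)

print(term_query(raw_terms, "on the"))     # False: only a part of the stored term
print(term_query(raw_terms, DESCRIPTION))  # True: the whole stored term
```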

1 Answer

Your first requirement is already satisfied by the english analyzer. Now to the second problem: when "on the" is passed, you want to get back the documents that match it in full. In that case the search should be done on the "description.raw" field.

Map the field as "raw": { "type": "string", "index": "analyzed" }. The default analyzer here is standard, so you will get every document in which "on", "the", or "on the" matches. If instead you want to match only documents containing the exact text "on the", configure a new custom analyzer for the "description.raw" field using the "edge ngram" tokenizer.

More details can be found at the link below: https://www.elastic.co/guide/en/elasticsearch/guide/current/_index_time_search_as_you_type.html
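
To make the difference between these analysis choices concrete, here is a small Python sketch. It is only a simplified stand-in for what Elasticsearch does: the stop word set is a hand-picked subset, stemming is left out, the function names are hypothetical, and the min_gram/max_gram values are assumptions chosen for the example. The edge n-gram part assumes the tokenizer treats the whole field value as one token.

```python
# Simplified stand-ins for analyzers; real Elasticsearch analyzers do more
# (for example, the english analyzer also stems). All names are hypothetical.

# Hand-picked subset of English stop words, enough for this example (assumption).
STOP_WORDS = {"a", "in", "of", "on", "the", "to"}

DESCRIPTION = ("on the first day of christmas my true love gave to me "
               "a partridge in a pear tree")


def standard_like(text):
    """Lowercase and split on whitespace; stop words are kept."""
    return text.lower().split()


def english_like(text):
    """Like standard_like, but stop words are dropped (stemming omitted)."""
    return [tok for tok in standard_like(text) if tok not in STOP_WORDS]


def edge_ngrams(text, min_gram=1, max_gram=20):
    """Prefixes of the whole value, from min_gram to max_gram characters long."""
    longest = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, longest + 1)]


# english-style analysis: the query "on the" is reduced to nothing.
print(english_like("on the"))  # []

# standard-style analysis: "on" and "the" survive and are found in the document.
doc_tokens = set(standard_like(DESCRIPTION))
print(all(tok in doc_tokens for tok in standard_like("on the")))  # True

# edge n-grams of the whole value: "on the" is indexed because it is a prefix.
grams = edge_ngrams(DESCRIPTION)
print("on the" in grams)     # True
print("the first" in grams)  # False: not a prefix of the value
```

Note that in this sketch the edge n-gram approach only matches text at the start of the field value, which happens to be where "on the" sits in the sample document.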