
I have a large ES index with dynamically created "keyword" fields, and I need to enable case-insensitive search on them. I understand that analyzers are not available for keyword fields and that a normalizer should be used instead: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-normalizers.html

Is there a way to add normalizers to existing fields or mappings dynamically? I can add an analyzer to existing text fields by closing the index, adding the analyzer, and reopening the index. The same approach does not seem to work for normalizers on an existing index. Is there a way to do this without creating another index and re-indexing all the data?

Here are my steps. First, I create a test index with the lowercase normalizer:

curl -XPUT localhost:9200/ganesh_index/ -d '
{
  "settings": {
    "analysis": {
      "normalizer": {
        "useLowercase": {
          "type": "custom",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings":{
     "ganesh_type":{
        "properties":{
           "title":{
              "normalizer":"useLowercase",
              "type":"keyword"
           }
        }
     }
  }
}'

Now, I can insert and query as desired:

curl -X PUT localhost:9200/ganesh_index/ganesh_type/1 -d '{"title":"ThisFox.StatusCode1"}'
curl -X PUT localhost:9200/ganesh_index/ganesh_type/2 -d '{"title":"ThisFox.StatusCode2"}'

curl -X POST 'localhost:9200/ganesh_index/_search?pretty' -d '{"query": {"regexp":{"title": "this.*code1"}}}'
{
  "took" : 24,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ganesh_index",
        "_type" : "ganesh_type",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "title" : "ThisFox.StatusCode1"
        }
      }
    ]
  }
}

However, if my index already exists like so:

curl -X PUT localhost:9200/ganesh_index -d '
{
  "settings": {
    "index": {
      "number_of_shards": 2,
      "number_of_replicas": 2
    }
  }
}'

and I insert records, I am unable to add the normalizer later.

curl -XPUT 'localhost:9200/ganesh_index/?pretty' -d '
{
  "settings": {
    "analysis": {
      "normalizer": {
        "useLowercase": {
          "type": "custom",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings":{
     "ganesh_type":{
        "properties":{
           "title":{
              "normalizer":"useLowercase",
              "type":"keyword"
           }
        }
     }
  }
}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_already_exists_exception",
        "reason" : "index [ganesh_index/mg5TckzaR5KZDE-FphTeDg] already exists",
        "index_uuid" : "mg5TckzaR5KZDE-FphTeDg",
        "index" : "ganesh_index"
      }
    ],
    "type" : "index_already_exists_exception",
    "reason" : "index [ganesh_index/mg5TckzaR5KZDE-FphTeDg] already exists",
    "index_uuid" : "mg5TckzaR5KZDE-FphTeDg",
    "index" : "ganesh_index"
  },
  "status" : 400
}

Is there any way to add the normalizer for an existing index (on keyword fields)?


2 Answers

0
votes

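No. The normalizer of an existing field cannot be changed in place, so you will have to either re-index the data or add a new field mapping that uses the normalizer.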
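The new-mapping route might look like the sketch below. It is untested and rests on several assumptions: a version of Elasticsearch that still uses mapping types (as in the question), a `title` field that is already mapped as `keyword`, and a sub-field name `lower` that is purely hypothetical. If `title` was dynamically mapped as something else, the mapping body would have to repeat its existing type. The script only prints the requests unless `APPLY=1` is set.

```shell
#!/bin/sh
# Sketch only: register a normalizer on an existing index and expose it
# through a NEW sub-field, leaving the original "title" mapping untouched.
ES="${ES:-localhost:9200}"          # assumed cluster address
INDEX="${INDEX:-ganesh_index}"
TYPE="${TYPE:-ganesh_type}"

# Analysis settings (including normalizers) can only change on a closed index.
SETTINGS='{"analysis":{"normalizer":{"useLowercase":{"type":"custom","filter":["lowercase"]}}}}'

# "lower" is a hypothetical sub-field name; queries would target title.lower.
# Assumes "title" is already a keyword field.
MAPPING='{"properties":{"title":{"type":"keyword","fields":{"lower":{"type":"keyword","normalizer":"useLowercase"}}}}}'

# Print each request by default; set APPLY=1 to actually send it with curl.
es() {
  method=$1; path=$2; body=$3
  if [ "${APPLY:-0}" = "1" ]; then
    curl -s -X "$method" "$ES/$path" -H 'Content-Type: application/json' ${body:+-d "$body"}
    echo
  else
    echo "$method $ES/$path $body"
  fi
}

es POST "$INDEX/_close"
es PUT  "$INDEX/_settings" "$SETTINGS"
es POST "$INDEX/_open"
# Wait until the reopened index has its primary shards back.
es GET  "_cluster/health/$INDEX?wait_for_status=yellow"
es PUT  "$INDEX/_mapping/$TYPE" "$MAPPING"
# Rewrite documents in place so existing ones populate title.lower.
es POST "$INDEX/_update_by_query?conflicts=proceed"
```

Documents indexed before the mapping change have no value in `title.lower` until the update-by-query pass has touched them, so this still rewrites every document; it only avoids creating a second index.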

0
votes

Currently, Elasticsearch does not support this. If you try it anyway, it responds with an error like this:

{
  "error": {
    "root_cause": [
      {
        "type": "resource_already_exists_exception",
        "reason": "index [category_video_autocomplete_3/FkxOwP_RQMW_L077hYLPJg] already exists",
        "index_uuid": "FkxOwP_RQMW_L077hYLPJg",
        "index": "category_video_autocomplete_3"
      }
    ],
    "type": "resource_already_exists_exception",
    "reason": "index [category_video_autocomplete_3/FkxOwP_RQMW_L077hYLPJg] already exists",
    "index_uuid": "FkxOwP_RQMW_L077hYLPJg",
    "index": "category_video_autocomplete_3"
  },
  "status": 400
}

The message looks complicated, but the key part is

resource_already_exists_exception

meaning that the resource you are trying to create already exists, so it cannot be created again. Here the resource is the index named category_video_autocomplete_3.
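To avoid hitting this error, a script can first check whether the index exists and pick the API accordingly. The following is a minimal sketch; the helper names `action_for_status` and `index_status` are made up for illustration, and the cluster address is an assumption. It relies on a HEAD request to the index returning 200 when the index exists and 404 when it does not.

```shell
#!/bin/sh
ES="${ES:-localhost:9200}"   # assumed cluster address

# Map the HTTP status of HEAD /<index> to the kind of request that fits.
action_for_status() {
  case "$1" in
    200) echo "update" ;;   # index exists: use the _settings / _mapping APIs
    404) echo "create" ;;   # index missing: PUT /<index> with settings and mappings
    *)   echo "unknown" ;;  # cluster unreachable or unexpected response
  esac
}

# Ask Elasticsearch whether the index exists; prints only the status code
# (curl prints 000 when it cannot connect).
index_status() {
  curl -s -o /dev/null -I -w '%{http_code}' "$ES/$1"
}

# Usage (needs a reachable cluster):
#   action_for_status "$(index_status ganesh_index)"
```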