
I need to prevent certain fields whose value is the string "null" or the empty string "" from being indexed in Elasticsearch, i.e. when I fetch the document, those fields should not appear in its _source.

Are there any settings required in the mappings at index time, such as using a custom analyzer on a field?

P.S. I am using Elasticsearch 7.6.1.

I tried the answer below, and this is how it failed:

{
  "settings": {
    "number_of_shards": "5",
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "char_filter": [
            {
              "type": "mapping",
              "mappings": [
                "null =>",
                "\"\"\" =>"
              ]
            }
          ],
          "filter": [
            "uppercase"
          ],
          "type": "custom"
        }
      }
    },
    "number_of_replicas": "1"
  }
}

Error response: only value lists are allowed in serialized settings

Even with the settings below, where the char filter is defined by name, I didn't get the expected outcome:

{
  "settings": {
    "number_of_shards": "5",
    "analysis": {
      "char_filter": {
        "my_filter": {
          "type": "mapping",
          "mappings": [
            "null =>",
            "\"\"\" =>"
          ]
        }
      },
      "normalizer": {
        "my_normalizer": {
          "char_filter": [
            "my_filter"
          ],
          "filter": [
            "uppercase"
          ],
          "type": "custom"
        }
      }
    },
    "number_of_replicas": "1"
  }
}

Request: GET indexname/_analyze

{"normalizer":"my_normalizer","text":"null"}

Response:

{
  "tokens": [
    {
      "token": "",
      "start_offset": 4,
      "end_offset": 4,
      "type": "word",
      "position": 0
    }
  ]
}

Expected response:

{
  "tokens": []
}
Please let me know if it works. If it does, please don't forget to upvote and accept the answer. - user156327

1 Answer


You can achieve this with a mapping char filter in your analyzer definition. Below is a working example.

Analyze API request (send it once with "text": "null" and once with "text": ""):

{
  "tokenizer": "standard",
  "char_filter": [
    {
      "type": "mapping",
      "mappings": [
        "null =>",
        "\"\"\" =>"
      ]
    }
  ],
  "text": "null"
}

Returned tokens:

{
  "tokens": []
}
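
Keep in mind that analysis (char filters, tokenizers, normalizers) only changes the terms written to the index; _source keeps the JSON exactly as it was submitted. A normalizer also always emits a single token, which is why the attempt in the question returns one empty token instead of none. If the goal is for those fields to be absent from _source, one option is to drop them before the document is sent. Below is a minimal sketch of that idea in plain Python. The function name strip_placeholder_fields and the sample document are hypothetical, the placeholder set ("null" and "") is assumed from the question, and the call that actually indexes the cleaned document is left out.

```python
from typing import Any, Dict, Iterable

# Values treated as "missing". Assumption: taken from the question ("null" and "").
PLACEHOLDERS = frozenset({"null", ""})


def strip_placeholder_fields(
    doc: Dict[str, Any],
    placeholders: Iterable[str] = PLACEHOLDERS,
) -> Dict[str, Any]:
    """Return a copy of doc without fields whose value is a placeholder string."""
    blocked = set(placeholders)
    cleaned: Dict[str, Any] = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            # Recurse into nested objects so inner placeholders are dropped too.
            cleaned[key] = strip_placeholder_fields(value, blocked)
        elif isinstance(value, str) and value in blocked:
            # Skip the field entirely so it never reaches _source.
            continue
        else:
            cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    # Hypothetical document, for illustration only.
    raw = {"name": "widget", "color": "null", "size": "", "meta": {"tag": "null", "id": 7}}
    print(strip_placeholder_fields(raw))
```

The match is exact and case-sensitive, so "NULL" or " " would be kept unless added to the placeholder set. Real null values (None in Python) are left untouched, since the question only concerns the string forms.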