I am writing a Python script to get the unique values in an Elasticsearch index. I am using a terms aggregation to get the unique values and their counts. However, when I passed a list of fields to the script, I realized that some of the fields are mapped as
"abc" : {
            "type" : "keyword"
        }

and some are stored as

"xyz" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          }

For the terms aggregation I use the query

{
    "aggs" : {
        "abc" : {
            "terms" : {
                "field" : "abc"
            }
        }
    }, "size":0
}
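Since the script receives a list of fields, the same request body can be generated in Python. The following is only a minimal sketch: `build_terms_aggs` and its `bucket_count` parameter are hypothetical names introduced for illustration, and the function takes a dict that maps each aggregation name to the field path to aggregate on, so the two can differ.

```python
import json


def build_terms_aggs(field_paths, bucket_count=10):
    """Build a search body with one terms aggregation per entry.

    field_paths maps an aggregation name to the field path to aggregate on.
    bucket_count is a hypothetical parameter for the number of buckets per field.
    """
    return {
        "size": 0,  # return no hits, only the aggregation buckets
        "aggs": {
            agg_name: {"terms": {"field": path, "size": bucket_count}}
            for agg_name, path in field_paths.items()
        },
    }


# The aggregation name stays "xyz" while the field path points at the sub-field.
body = build_terms_aggs({"abc": "abc", "xyz": "xyz.keyword"})
print(json.dumps(body, indent=2))
```

The resulting dict has the same shape as the JSON query above and can be passed as the body of a search request.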

But when this query is run on "xyz", it fails with the error: Fielddata is disabled on text fields by default. Set fielddata=true on [description] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.

To run the query for "xyz" I need to append ".keyword" to the field name, but then the query for "abc" no longer works.

Is there any way to check which type each field has and then adjust the query accordingly with an if/else?

1 Answer

You can have both: the field can be aggregatable and searchable without the .keyword suffix. Simply adjust your mapping as the error message suggests:

"xyz" : {
   "type" : "text",
   "fielddata": true
}

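Then reindex and you're good to go. Keep in mind that, as the error message warns, fielddata can use significant memory, and that a terms aggregation on a text field counts the analyzed terms (individual tokens) rather than the original whole values.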

As to whether there's a query-time check to determine which fields are of which type: there is none within the query itself. One of the core principles of Elasticsearch is that field types are defined up front, so that the data is indexed appropriately and searches and aggregations are optimized. It is therefore assumed that, at query time, you already know the type of each field.
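Although the query cannot branch on field type, a script can read the index mapping before it builds the query and pick the field path from that. The sketch below assumes the `properties` section of the mapping has already been fetched (for example with the get mapping API, `GET /<index>/_mapping`) and is available as a plain dict. `aggregatable_name` is a hypothetical helper, and it only handles the two mapping shapes from the question (`keyword`, and `text` with an optional keyword sub-field); other field types are reported as `None`.

```python
def aggregatable_name(field, properties):
    """Return the field path to use in a terms aggregation, or None.

    properties is assumed to be the "properties" dict of the index mapping.
    Only keyword and text fields are handled in this sketch.
    """
    spec = properties.get(field)
    if spec is None:
        return None  # field is not in the mapping
    if spec.get("type") == "keyword":
        return field  # keyword fields can be aggregated directly
    if spec.get("type") == "text":
        if spec.get("fielddata"):
            return field  # fielddata was enabled explicitly in the mapping
        # Otherwise look for a keyword sub-field such as "xyz.keyword".
        for sub_name, sub_spec in spec.get("fields", {}).items():
            if sub_spec.get("type") == "keyword":
                return f"{field}.{sub_name}"
    return None


# Mapping fragments copied from the question.
properties = {
    "abc": {"type": "keyword"},
    "xyz": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
}
resolved = {name: aggregatable_name(name, properties) for name in ("abc", "xyz")}
print(resolved)  # {'abc': 'abc', 'xyz': 'xyz.keyword'}
```

Fields that resolve to `None` can be skipped or reported before the aggregation request is sent, which gives the if/else behaviour asked about in the question without changing the mapping.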