In short: with Elasticsearch, given a list of fields, how can I get the average number of missing fields per document as an aggregation?
Details
With the missing aggregation type, I can get the total number of documents where a given field is missing. So with the following data:
"hits": [{
"name": "A name",
"nickname": "A nickname",
"bestfriend": "A friend",
"hobby": "An hobby"
},{
"name": "A name",
"hobby": "An hobby"
},{
"name": "A name",
"nickname": "A nickname",
"hobby": "An hobby"
},{
"name": "A name",
"bestfriend": "A friend"
}]
I can run the following query:
{
"aggs": {
"name_missing": {
"missing": {"field": "name"}
},
"nickname_missing": {
"missing": {"field": "nickname"}
},
"hobby_missing": {
"missing": {"field": "hobby"}
},
"bestfriend_missing": {
"missing": {"field": "bestfriend"}
}
}
}
And I get the following aggregations:
...
"aggregations": {
"name_missing": {
"doc_count": 0
},
"nickname_missing": {
"doc_count": 2
},
"hobby_missing": {
"doc_count": 1
},
"bestfriend_missing": {
"doc_count": 2
}
}
...
What I need now is the average number of missing fields per document. I can just do the math in code on the results:
- sum the doc_count values of all the missing aggregations
- divide by the total number of hits
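For reference, the client-side math I mean could look roughly like the sketch below. It is only an illustration: the function name average_missing_fields, the FIELDS list, and the stand-in response are my own assumptions, and the response is built from the four sample documents instead of coming from a real cluster. It also assumes every missing aggregation is named with a "_missing" suffix, as in the query above.

```python
# Hypothetical list of fields to check (same four as in the query above).
FIELDS = ["name", "nickname", "hobby", "bestfriend"]


def average_missing_fields(response):
    """Sum the doc_count of every *_missing aggregation, divide by total hits."""
    total = response["hits"]["total"]
    # Elasticsearch 7+ reports the total as {"value": N, "relation": ...};
    # older versions report a plain integer.
    if isinstance(total, dict):
        total = total["value"]
    if total == 0:
        return 0.0
    missing = sum(
        agg["doc_count"]
        for name, agg in response["aggregations"].items()
        if name.endswith("_missing")  # assumed naming convention
    )
    return missing / total


# Stand-in for a real search response, derived from the sample documents.
docs = [
    {"name": "A name", "nickname": "A nickname",
     "bestfriend": "A friend", "hobby": "An hobby"},
    {"name": "A name", "hobby": "An hobby"},
    {"name": "A name", "nickname": "A nickname", "hobby": "An hobby"},
    {"name": "A name", "bestfriend": "A friend"},
]
response = {
    "hits": {"total": len(docs)},
    "aggregations": {
        f"{field}_missing": {"doc_count": sum(field not in d for d in docs)}
        for field in FIELDS
    },
}

print(average_missing_fields(response))
```

This works, but it needs one missing aggregation per field plus a post-processing step in the client, which is what I would like to avoid.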
But how would you get the same result as an aggregation from Elasticsearch?
Thank you for any reply / suggestion.