My Elasticsearch index uses nested documents to record the places where events related to each document occurred. I am using aggregations to get facets of the places. The count returned for each place is the number of nested documents that mention it. For example, if a document has both a birth place and a death place of California, the aggregation count for California is 2. I would like the count to be the number of parent documents containing a particular place, rather than the number of nested (child) documents containing it. The relevant part of my schema looks like this:
"mappings": {
  "document": {
    "properties": {
      "docId": { "type": "keyword" },
      "place": {
        "type": "nested",
        "properties": {
          "id": { "type": "keyword" },
          "type": { "type": "keyword" },
          "loc": { "type": "geo_point" },
          "text": {
            "type": "text",
            "analyzer": "english",
            "copy_to": "text"
          }
        },
        "dynamic": false
      }
    }
  }
}
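To make the counting issue concrete, here is a minimal sample document that fits this mapping. The `docId` and place `id` values are made up for illustration, and the `loc` field is left out for brevity.

```json
{
  "docId": "doc-1",
  "place": [
    { "id": "california", "type": "place.vital.birth", "text": "California" },
    { "id": "california", "type": "place.vital.death", "text": "California" }
  ]
}
```

A terms aggregation on `place.id` run inside the nested context sees two nested documents for California here, although only one parent document is involved.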
I can get facets with a simple aggregation like the one below. It retrieves the places whose type matches place.vital.* (e.g. place.vital.birth, place.vital.death), but it counts the number of nested documents, not the number of parent documents.
"aggs": {
  "place.vital": {
    "aggs": {
      "types": {
        "aggs": {
          "values": {
            "terms": {
              "field": "place.id"
            }
          }
        },
        "terms": {
          "field": "place.type",
          "include": "place\\.vital\\..*"
        }
      }
    },
    "nested": {
      "path": "place"
    }
  }
}
Is it possible to tweak my aggregation so that it only counts each parent document once?
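One direction that may work, offered as an untested sketch rather than a confirmed solution: Elasticsearch provides a `reverse_nested` aggregation that joins from nested documents back to their parent documents. Adding it as a sub-aggregation under the `place.id` terms should give a parent-level count for each bucket. The sub-aggregation name `docs` is an arbitrary choice made here, and `"size": 0` is only added to suppress search hits.

```json
{
  "size": 0,
  "aggs": {
    "place.vital": {
      "nested": { "path": "place" },
      "aggs": {
        "types": {
          "terms": {
            "field": "place.type",
            "include": "place\\.vital\\..*"
          },
          "aggs": {
            "values": {
              "terms": { "field": "place.id" },
              "aggs": {
                "docs": { "reverse_nested": {} }
              }
            }
          }
        }
      }
    }
  }
}
```

With this shape, each `values` bucket would still report the nested-document count in its own `doc_count`, while `docs.doc_count` would report the number of distinct parent documents behind that bucket. Because the `values` buckets sit under the `types` buckets, the parent count is per place type; a place that appears as both a birth place and a death place of the same document would still show up once under each type.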