Elasticsearch: aggregate on two fields

Question

I'm not sure how best to phrase this question. I'm using Elasticsearch 2.2.

Let's start with an example dataset of 4 documents:

[
  {
    "header": {
      "called_entity": { "uuid": "a" },
      "coverage_entity": {},
      "successful_transfers": 1
    }
  },
  {
    "header": {
      "called_entity": { "uuid": "a" },
      "coverage_entity": { "uuid": "b" },
      "successful_transfers": 1
    }
  },
  {
    "header": {
      "called_entity": { "uuid": "b" },
      "coverage_entity": { "uuid": "a" },
      "successful_transfers": 1
    }
  },
  {
    "header": {
      "called_entity": { "uuid": "b" },
      "coverage_entity": { "uuid": "a" },
      "successful_transfers": 0
    }
  }
]

called_entity always has a uuid. coverage_entity can be empty or have a uuid.

What I want is to aggregate on either called_entity.uuid or coverage_entity.uuid, and then, for each uuid, count the number of documents and sum successful_transfers. So, for these 4 documents, I would expect a result like this:

uuid,doc_count,successful_transfers_count
"a",4,3
"b",3,2

The problem is that the same document can then be counted in several buckets, as long as the bucket key appears in either called_entity.uuid or coverage_entity.uuid (I'm not even sure that's possible, which is why I'm posting here).
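To make the intended semantics concrete, here is a minimal Python sketch that computes the expected table from the sample documents outside Elasticsearch. The function name aggregate_on_either is made up for illustration, and counting a document only once when both fields hold the same uuid is this sketch's assumption, not something stated in the question.

```python
from collections import defaultdict

# The sample documents from the question.
docs = [
    {"header": {"called_entity": {"uuid": "a"}, "coverage_entity": {},
                "successful_transfers": 1}},
    {"header": {"called_entity": {"uuid": "a"}, "coverage_entity": {"uuid": "b"},
                "successful_transfers": 1}},
    {"header": {"called_entity": {"uuid": "b"}, "coverage_entity": {"uuid": "a"},
                "successful_transfers": 1}},
    {"header": {"called_entity": {"uuid": "b"}, "coverage_entity": {"uuid": "a"},
                "successful_transfers": 0}},
]


def aggregate_on_either(documents):
    """Put each document in one bucket per distinct uuid found in either field."""
    buckets = defaultdict(lambda: {"doc_count": 0, "successful_transfers": 0})
    for doc in documents:
        header = doc["header"]
        # Collect the uuids of both entities; an empty coverage_entity yields None,
        # which is dropped. Using a set is this sketch's choice (assumption): a
        # document with the same uuid in both fields is counted once.
        keys = {
            header["called_entity"].get("uuid"),
            header["coverage_entity"].get("uuid"),
        } - {None}
        for key in keys:
            buckets[key]["doc_count"] += 1
            buckets[key]["successful_transfers"] += header["successful_transfers"]
    return dict(buckets)


result = aggregate_on_either(docs)
print("uuid,doc_count,successful_transfers_count")
for uuid in sorted(result):
    row = result[uuid]
    print(f'"{uuid}",{row["doc_count"]},{row["successful_transfers"]}')
```

The point of the sketch is only that bucket membership is decided per document from the union of the two fields, so the per-bucket doc_count values can add up to more than the number of documents.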

What I'm currently doing is simply aggregating on the called_entity.uuid field, but of course that's not enough:

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "dim_1": {
      "terms": {
        "field": "header.called_entity.uuid",
        "size": 0
      },
      "aggs": {
        "successful_transfers": {
          "sum": {
            "field": "header.successful_transfers"
          }
        }
      }
    }
  }
}

Which gives me something like:

uuid,doc_count,successful_transfers_count
"a",2,2
"b",2,1

...which is not what I want. So, how can I aggregate on several fields at once, or, for a given bucket, compute values from all the documents that mention its key (not only the ones the terms aggregation put in that bucket)?

Thank you.

Andrei Stefan · Accepted Answer · 2016-07-20T13:52:33

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "dim1": {
      "terms": {
        "script" : "return doc['header.called_entity.uuid'] + doc['header.coverage_entity.uuid']",
        "size": 10
      },
      "aggs": {
        "successful_transfers": {
          "sum": {
            "field": "header.successful_transfers"
          }
        }
      }
    }
  }
}
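For reference, a terms aggregation with a sum sub-aggregation returns its buckets under aggregations.<name>.buckets, each with a key, a doc_count, and the metric's value. The sketch below flattens such a response into the CSV rows used in the question. The sample response is hand-written to match the table the question asks for, not output captured from a cluster, and the helper name buckets_to_rows is hypothetical.

```python
# Hand-written sample shaped like a terms + sum aggregation response.
# Illustrative only (assumption): it mirrors the table the question expects.
sample_response = {
    "aggregations": {
        "dim1": {
            "buckets": [
                {"key": "a", "doc_count": 4, "successful_transfers": {"value": 3.0}},
                {"key": "b", "doc_count": 3, "successful_transfers": {"value": 2.0}},
            ]
        }
    }
}


def buckets_to_rows(response, agg_name="dim1", metric="successful_transfers"):
    """Flatten terms buckets into (uuid, doc_count, metric_sum) tuples."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # The sum metric comes back as a float; the counts here are whole numbers.
        rows.append((bucket["key"], bucket["doc_count"], int(bucket[metric]["value"])))
    return rows


rows = buckets_to_rows(sample_response)
print("uuid,doc_count,successful_transfers_count")
for uuid, doc_count, transfers in rows:
    print(f'"{uuid}",{doc_count},{transfers}')
```

Comparing these rows against the expected table is a quick way to check whether the script really places each document under both of its uuids on your mapping and Elasticsearch version.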
