4
votes

I'm using dateHistogram aggregation with ElasticSearch Java API, and it works pretty well for simple aggregations, such as the number of hits per hour/day/month/year (imagine a series of documents, where the date histogram aggregation is made on 'indexed_date' field).

But can I, with a single query, aggregate by date and by the values of another field at the same time? Something like what Kibana does for charts.

An example of what I would like to achieve:

I have a series of documents, where each one is an "event", which has its timestamp. These documents have a series of fields, like "status", "version", etc.

Can I get a date histogram aggregation on the timestamp field that is also broken down by every value of another field?

Example result of aggregation with a one hour interval:

H: 12 status - { ACTIVE: 34 PAUSED: 12 }

H: 13 status - { ACTIVE: 10 }

EDIT:

Some sample data:

"doc1" - { timestamp: "2014-12-23 12:01", status: "ACTIVE", version: 1 }
"doc2" - { timestamp: "2014-12-23 12:15", status: "PAUSED", version: 1 }
"doc3" - { timestamp: "2014-12-23 13:55", status: "ACTIVE", version: 2 }
(and so on..)
Just to confirm what you're looking for: you want hourly buckets (date histogram), and each bucket contains a count of something? E.g. a count of fields with "active": true, or "paused": true? If you could add some data to the question, it would be easier to figure it out. - Olly Cruickshank
Yes, this is what I'm looking for. I'm editing the question to add a few more data samples. - Carmine Giangregorio

2 Answers

4
votes

I would nest a terms aggregation inside the date histogram.

In the example below, you can see the document counts returned for each distinct status value within each hourly bucket:

curl -XGET 'http://localhost:9200/myindex/mydata/_search?search_type=count&pretty' -d '
{
  "query" : {
    "match_all" : { }
  },
  "aggs" : {
    "date_hist_agg" : {
      "date_histogram" : { "field" : "timestamp", "interval" : "hour" },
      "aggs" : {
        "status_agg" : {
          "terms" : { "field" : "status" }
        }
      }
    }
  }
}'
{
  "took" : 213,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "date_hist_agg" : {
      "buckets" : [ {
        "key_as_string" : "2014-12-23T17:00:00.000Z",
        "key" : 1419354000000,
        "doc_count" : 2,
        "status_agg" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "active",
            "doc_count" : 1
          }, {
            "key" : "paused",
            "doc_count" : 1
          } ]
        }
      }, {
        "key_as_string" : "2014-12-23T18:00:00.000Z",
        "key" : 1419357600000,
        "doc_count" : 1,
        "status_agg" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [ {
            "key" : "active",
            "doc_count" : 1
          } ]
        }
      } ]
    }
  }
}
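
To make the bucket structure concrete, here is a minimal plain-Java sketch of what the nested aggregation computes: group by hour first, then count each status inside that hour. It does not touch Elasticsearch at all; the class `HourlyStatusCounts`, the `Event` type, and the `countByHourAndStatus` and `format` helpers are hypothetical names made up for illustration, and the input is the three sample documents from the question.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HourlyStatusCounts {

    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    // Hypothetical stand-in for an indexed event document.
    static final class Event {
        final LocalDateTime timestamp;
        final String status;

        Event(String timestamp, String status) {
            this.timestamp = LocalDateTime.parse(timestamp, FORMAT);
            this.status = status;
        }
    }

    // Outer map plays the role of the date histogram buckets (one per hour),
    // inner map plays the role of the terms buckets (one per status value).
    static Map<LocalDateTime, Map<String, Long>> countByHourAndStatus(List<Event> events) {
        Map<LocalDateTime, Map<String, Long>> result = new TreeMap<>();
        for (Event e : events) {
            LocalDateTime hour = e.timestamp.truncatedTo(ChronoUnit.HOURS);
            result.computeIfAbsent(hour, h -> new TreeMap<>())
                  .merge(e.status, 1L, Long::sum);
        }
        return result;
    }

    // Renders the counts in the shape sketched in the question.
    // Only the hour of day is printed, so this is ambiguous across several days.
    static String format(Map<LocalDateTime, Map<String, Long>> counts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<LocalDateTime, Map<String, Long>> hour : counts.entrySet()) {
            sb.append("H: ").append(hour.getKey().getHour()).append(" status - {");
            for (Map.Entry<String, Long> status : hour.getValue().entrySet()) {
                sb.append(' ').append(status.getKey()).append(": ").append(status.getValue());
            }
            sb.append(" }\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("2014-12-23 12:01", "ACTIVE"),
                new Event("2014-12-23 12:15", "PAUSED"),
                new Event("2014-12-23 13:55", "ACTIVE"));
        System.out.print(format(countByHourAndStatus(events)));
    }
}
```

Unlike this sketch, the Elasticsearch response above returns the status keys in lowercase ("active", "paused"), most likely because the status field is analyzed; mapping it as not_analyzed should keep the original values.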
1
votes

Using the same aggregation names used in the previous answer, I would do the following:

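    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogram;
    import org.elasticsearch.search.aggregations.bucket.terms.Terms;

    // Assumes `client` is an already-initialized org.elasticsearch.client.Client field.
    public void yourSearch(String indexName, String typeName) {

        // Hourly date histogram on "timestamp" with a terms sub-aggregation on "status"
        SearchResponse sr = client.prepareSearch(indexName)
                .setTypes(typeName)
                .addAggregation(AggregationBuilders.dateHistogram("date_hist_agg")
                        .field("timestamp")
                        .interval(DateHistogram.Interval.hours(1))
                        .minDocCount(0)
                        .subAggregation(AggregationBuilders.terms("status_agg").field("status")))
                .execute().actionGet();

        // Outer loop: one bucket per hour
        DateHistogram dateHistAgg = sr.getAggregations().get("date_hist_agg");
        for (DateHistogram.Bucket hourBucket : dateHistAgg.getBuckets()) {

            // Inner loop: one bucket per status value within that hour
            Terms statusAgg = hourBucket.getAggregations().get("status_agg");
            for (Terms.Bucket statusBucket : statusAgg.getBuckets()) {
                String key = statusBucket.getKey();
                long cnt = statusBucket.getDocCount();

                // use the key,cnt

            }
        }
    }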