Elasticsearch java api get average of terms aggregation

Question

I'm using Elasticsearch with the Java API and I'm trying to get the average of the lowest value in each bucket of a terms aggregation. One solution I found is to fetch the results like this:

AggregationBuilders.terms("group_by_flights").field("flight_id")
    .subAggregation(AggregationBuilders.min("minimum").field("duration"))

and then compute the average on the client side. The problem is that if there are a lot of results, computing it this way will allocate a lot of memory. I would like to do this on the Elasticsearch side. I found that there is an avg bucket pipeline aggregation, which can be added as a sibling aggregation to terms (and others):

"the average": {
  "avg_bucket": {
    "buckets_path": "some_bucket_path" 
  }
}
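To make the target computation concrete, here is a minimal plain-Java sketch (no Elasticsearch client involved) of what the terms + min + avg_bucket combination is meant to return: group the records by flight_id, keep the smallest duration in each group, then average those minima. The class name, the row layout and the sample values are invented for illustration and do not come from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AvgOfMinimaSketch {

    // Hypothetical sample rows in the form {flight_id, duration}.
    static final long[][] SAMPLE = {
        {1, 120}, {1, 90}, {2, 200}, {2, 240}, {3, 60}
    };

    // Groups rows by flight_id, keeps the smallest duration per group,
    // then averages those per-group minima.
    static double averageOfMinima(long[][] rows) {
        Map<Long, Long> minByFlight = new LinkedHashMap<>();
        for (long[] row : rows) {
            // merge keeps the smaller of the stored and the new duration
            minByFlight.merge(row[0], row[1], Long::min);
        }
        return minByFlight.values().stream()
                .mapToLong(Long::longValue)
                .average()
                .orElse(Double.NaN); // no groups: no meaningful average
    }

    public static void main(String[] args) {
        System.out.println(averageOfMinima(SAMPLE));
    }
}
```

With the sample rows the per-flight minima are 90, 200 and 60, so the result is their mean. The avg_bucket aggregation is expected to do the equivalent of the last step on the server, over the buckets produced by the terms aggregation.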

The problem is that, as far as I can see, in the Java API you can add a pipeline aggregation only as a sub-aggregation. So if we construct the aggregation like this, the terms aggregation won't be visible to the pipeline aggregation:

AggregationBuilders.terms("group_by_flights").field("flight_id")
    // "group_by_flights" won't be seen here, because avgBucket is its sub-aggregation
    .subAggregation(PipelineAggregatorBuilders.avgBucket("avg", "group_by_flights.duration"))

I was thinking about creating an empty top-level aggregation and adding all the aggregations as its sub-aggregations, but that seems like a silly workaround, and I suspect I'm misunderstanding something. Any ideas?

Paweł Sosnowski · Accepted Answer · 2018-11-06T15:17:16

The only solution I have found so far is to add the aggregations as sub-aggregations of an "empty" global aggregation:

AggregationBuilders.global("global_aggregation")
    .subAggregation(AggregationBuilders.terms("group_by_flights").field("flight_id")
        .subAggregation(AggregationBuilders.min("min").field("duration")))
    .subAggregation(PipelineAggregatorBuilders.avgBucket("avg_bucket_aggs", "group_by_flights>min"))
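For reference, the builder tree above should correspond to a request body in which the terms aggregation and avg_bucket are siblings inside the global aggregation. The sketch below only assembles that JSON as a string in plain Java, so the nesting is easy to inspect. It does not use the Elasticsearch client, and the class and method names are hypothetical. One caveat worth keeping in mind: a global aggregation is computed over all documents in the searched indices and ignores the search query, so any filtering has to happen inside it.

```java
public class GlobalAvgBucketBody {

    // Builds the JSON body for: global > (terms > min) plus an avg_bucket
    // that is a sibling of the terms aggregation.
    static String build(String termsField, String metricField) {
        String minAgg = "\"min\":{\"min\":{\"field\":\"" + metricField + "\"}}";
        String termsAgg = "\"group_by_flights\":{\"terms\":{\"field\":\"" + termsField + "\"},"
                + "\"aggs\":{" + minAgg + "}}";
        // buckets_path uses '>' to step from the terms aggregation into its min metric
        String avgAgg = "\"avg_bucket_aggs\":{\"avg_bucket\":"
                + "{\"buckets_path\":\"group_by_flights>min\"}}";
        return "{\"aggs\":{\"global_aggregation\":{\"global\":{},"
                + "\"aggs\":{" + termsAgg + "," + avgAgg + "}}}}";
    }

    public static void main(String[] args) {
        System.out.println(build("flight_id", "duration"));
    }
}
```

In the response, the averaged value would then be found under the global aggregation, in the entry named avg_bucket_aggs, next to the group_by_flights buckets.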
