I am having trouble creating an Elasticsearch query with aggregations.

This is my data:

user |rank|comment
----------------- 
john |1   |too bad 
john |2   |almost 
john |3   |well done
james|8   |awesome  
james|3   |poor

I am interested in the highest rank for every user AND the corresponding comment, like this:

user |rank|comment
-----------------
john |3   |well done
james|8   |awesome

Here are the requests I used to index the data:

PUT /myindex/_doc/1
{
    "user" : "john",
    "rank" : 1,
    "comment" : "too bad"
}

PUT /myindex/_doc/2
{
    "user" : "john",
    "rank" : 2,
    "comment" : "almost"
}

PUT /myindex/_doc/3
{
    "user" : "john",
    "rank" : 3,
    "comment" : "well done"
}

PUT /myindex/_doc/4
{
    "user" : "james",
    "rank" : 8,
    "comment": "awesome"
}

PUT /myindex/_doc/5
{
    "user" : "james",
    "rank" : 3,
    "comment": "poor"
}

And here is my aggregations query:

GET  /myindex/_doc/_search
{
  "size":0, 
  "aggs": {
    "by": {
      "terms": {
        "field": "user.keyword"
      },
      "aggs": {
        "maxrank": {
          "max": {
            "field": "rank"
          }
        }
      }
    }
  }
}

It gives me this:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "by": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "john",
          "doc_count": 3,
          "maxrank": {
            "value": 3
          }
        },
        {
          "key": "james",
          "doc_count": 2,
          "maxrank": {
            "value": 8
          }
        }
      ]
    }
  }
}

So the maximum rank per user is correct, but how can I get the corresponding comment field into the query result? If I were using a SQL database, I would write the aggregation as a subquery and join it to the base table. How can I achieve this in Elasticsearch?

1 Answer

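Instead of using a max metric aggregation on rank, I would simply use a top_hits aggregation, sorted by rank in descending order and limited to one hit per bucket. That returns the highest-ranked document for each user, including its comment: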

GET  /myindex/_doc/_search
{
  "size":0, 
  "aggs": {
    "by": {
      "terms": {
        "field": "user.keyword"
      },
      "aggs": {
        "maxrank": {
          "top_hits": {
            "_source": ["rank", "comment"],
            "sort": {"rank": "desc"},
            "size": 1
          }
        }
      }
    }
  }
}
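
To get from that response back to the user/rank/comment table in the question, the top hit in each bucket has to be unpacked on the client side. Below is a minimal Python sketch of that step. The helper name `top_rows` is made up for illustration, and `sample_response` is a hand-written, trimmed response based on the sample data and on the way top_hits nests its results under `hits.hits`; it was not captured from a real cluster, so treat the exact shape as an assumption.

```python
from typing import Any, Dict, List


def top_rows(
    response: Dict[str, Any],
    terms_agg: str = "by",
    hits_agg: str = "maxrank",
) -> List[Dict[str, Any]]:
    """Flatten a terms + top_hits response into one row per user.

    Hypothetical helper: the aggregation names default to the ones
    used in the query above.
    """
    rows = []
    for bucket in response["aggregations"][terms_agg]["buckets"]:
        hits = bucket[hits_agg]["hits"]["hits"]
        if not hits:
            # Skip a bucket that has no hit to unpack.
            continue
        source = hits[0]["_source"]  # size is 1, so this is the top-ranked doc
        rows.append(
            {
                "user": bucket["key"],
                "rank": source["rank"],
                "comment": source["comment"],
            }
        )
    return rows


# Hand-written, trimmed sample response (an assumption, not real output).
sample_response = {
    "aggregations": {
        "by": {
            "buckets": [
                {
                    "key": "john",
                    "doc_count": 3,
                    "maxrank": {
                        "hits": {
                            "hits": [
                                {"_id": "3", "_source": {"rank": 3, "comment": "well done"}}
                            ]
                        }
                    },
                },
                {
                    "key": "james",
                    "doc_count": 2,
                    "maxrank": {
                        "hits": {
                            "hits": [
                                {"_id": "4", "_source": {"rank": 8, "comment": "awesome"}}
                            ]
                        }
                    },
                },
            ]
        }
    }
}

for row in top_rows(sample_response):
    print(row["user"], row["rank"], row["comment"])
```

Because the sort and the size limit are applied inside each user bucket, no join is needed: the document that carries the highest rank also carries its comment.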