I want to run a search request that returns only the unique values of one field, ordered by another field. In my case I only need the field refID, and the values should be ordered by timestamp. To get only the unique IDs, I use an aggregation.

Example of the request from the Elastic Dev Tools:

POST /MY-INDEX/_search
{
  "size": 0,
  "query": {
    "bool": {
      //my query
    }
  },
  "sort": [
    { "timestamp": { "order": "asc" } }
  ],
  "aggs": {
    "unique_id": {
      "terms": {
        "field": "refID"
      }
    }
  }
}

However, the order of the aggregation buckets does not match the order of the search results. Is it possible to get the aggregation buckets in the same order?

---- UPDATE ---- Example data:

{
  "timestamp": 1604582511657,
  "id": 4,
  "refID": "ref3"
}
{
  "timestamp": 1604582511655,
  "id": 3,
  "refID": "ref1"
}
{
  "timestamp": 1604582511654,
  "id": 2,
  "refID": "ref1"
}
{
  "timestamp": 1604582511653,
  "id": 1,
  "refID": "ref2"
}

Search Result:

"aggregations": {
    "unique_id": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "ref1",
          "doc_count": 2
        },
        {
          "key": "ref2",
          "doc_count": 1
        },
        {
          "key": "ref3",
          "doc_count": 1
        }
      ]
    }
  }

Expected Result:

 "buckets": [
        {
          "key": "ref2",
          "doc_count": 2
        },
        {
          "key": "ref1",
          "doc_count": 1
        },
        {
          "key": "ref3",
          "doc_count": 1
        }
      ]

The aggregation buckets are ordered by doc_count, not by timestamp like the search query.

Can you please share some sample index data and the expected search result? - ESCoder
@Bhavya I added example data and the result. - Clemens
Based on your example data, the doc_count for "refID": "ref1" is 2. How come you are showing the doc_count for "refID": "ref2" as 2? - ESCoder
Can you please share your expected search result as well? - ESCoder
Sorry, this was a copy-and-paste error. Thank you for your solution, it is exactly what I wanted. - Clemens

1 Answer

You can combine a terms aggregation with a max sub-aggregation and order the terms buckets by that sub-aggregation to get the result you need.
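To make the idea concrete, here is a minimal Python sketch that mimics locally what the two aggregations compute: group the documents by refID, take the maximum timestamp per group, and sort the groups by that value. It uses the sample documents from the question. The function name buckets_by_latest and its descending parameter are made up for illustration, and this is only an approximation of the behaviour, not how Elasticsearch implements it.

```python
from collections import defaultdict

# Sample documents copied from the question.
docs = [
    {"timestamp": 1604582511657, "id": 4, "refID": "ref3"},
    {"timestamp": 1604582511655, "id": 3, "refID": "ref1"},
    {"timestamp": 1604582511654, "id": 2, "refID": "ref1"},
    {"timestamp": 1604582511653, "id": 1, "refID": "ref2"},
]


def buckets_by_latest(docs, descending=False):
    """Hypothetical helper: one bucket per refID, ordered by max timestamp."""
    latest = {}                # refID -> max timestamp (the "max" sub-aggregation)
    counts = defaultdict(int)  # refID -> number of documents (doc_count)
    for doc in docs:
        key = doc["refID"]
        counts[key] += 1
        latest[key] = max(latest.get(key, doc["timestamp"]), doc["timestamp"])
    # Sort the bucket keys by the per-bucket max timestamp.
    ordered = sorted(latest, key=latest.get, reverse=descending)
    return [
        {"key": k, "doc_count": counts[k], "latestOrder": latest[k]}
        for k in ordered
    ]


ascending_keys = [b["key"] for b in buckets_by_latest(docs)]
print(ascending_keys)  # ['ref2', 'ref1', 'ref3']
```

Sorted ascending, the keys come out in the order the question asks for; passing descending=True puts the most recently seen refID first.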

Here is a working example with index data, search query, and search result.

Index Data:

{
  "timestamp": 1604582511657,
  "id": 4,
  "refID": "ref3"
}
{
  "timestamp": 1604582511656,
  "id": 3,
  "refID": "ref1"
}
{
  "timestamp": 1604582511654,
  "id": 2,
  "refID": "ref1"
}
{
  "timestamp": 1604582511655,
  "id": 1,
  "refID": "ref2"
}

Search Query:

{
  "size": 0,
  "aggs": {
    "unique_id": {
      "terms": {
        "field": "refID.keyword",
        "order": {
          "latestOrder": "desc"
        }
      },
      "aggs": {
        "latestOrder": {
          "max": {
            "field": "timestamp"
          }
        }
      }
    }
  }
}
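The query above orders the buckets newest first ("desc"), while the question sorts by timestamp ascending. As a sketch, the same request body can be built in Python with the direction as a parameter. The helper name build_unique_ref_query and its arguments are hypothetical; only the JSON structure is taken from the query above, and sending the body to the cluster is left out.

```python
import json


def build_unique_ref_query(direction="asc", ref_field="refID.keyword",
                           ts_field="timestamp"):
    """Hypothetical helper: terms buckets ordered by a max sub-aggregation."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return {
        "size": 0,  # no hits, only aggregation results
        "aggs": {
            "unique_id": {
                "terms": {
                    "field": ref_field,
                    # Order buckets by the sub-aggregation defined below.
                    "order": {"latestOrder": direction},
                },
                "aggs": {
                    "latestOrder": {"max": {"field": ts_field}},
                },
            }
        },
    }


body = build_unique_ref_query("asc")
print(json.dumps(body, indent=2))
```

With "asc", the bucket whose latest document is oldest comes first; with "desc", the body is the same as the query shown above.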

Search Result:

"aggregations": {
    "unique_id": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "ref3",
          "doc_count": 1,
          "latestOrder": {
            "value": 1.604582511657E12
          }
        },
        {
          "key": "ref1",
          "doc_count": 2,
          "latestOrder": {
            "value": 1.604582511656E12
          }
        },
        {
          "key": "ref2",
          "doc_count": 1,
          "latestOrder": {
            "value": 1.604582511655E12
          }
        }
      ]
    }