I am trying to sort the buckets of a terms aggregation in Elasticsearch case-insensitively. Here is the field mapping:

'brandName'       => [
    'type'     => 'string',
    'analyzer' => 'english',
    'index'    => 'analyzed',
    'fields'   => [
        'raw' => [
            'type'  => 'string',
            'index' => 'not_analyzed'
        ]
    ]
]

Note that the data structures shown here are PHP arrays.

And the aggregation looks like this:

'aggregations' => [
    'brands' => [
        'terms' => [
            'field' => 'brandName.raw',
            'size'  => 0,
            'order' => ['_term' => 'asc']
        ]
    ]
]

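This works, but the resulting buckets are in case-sensitive lexicographical order, so names starting with an uppercase letter sort before all names starting with a lowercase one.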

I found some interesting docs here that explain how to do this, but only in the context of sorting hits, not aggregation buckets.

I tried it anyway. Here is the analyzer I created:

'analysis' => [
    'analyzer' => [
        'case_insensitive_sort' => [
            'tokenizer' => 'keyword',
            'filter' => [ 'lowercase' ]
        ]
    ]
]

And here is the updated field mapping, with a new sub-field called "sort" using the analyzer.

'brandName'       => [
    'type'     => 'string',
    'analyzer' => 'english',
    'index'    => 'analyzed',
    'fields'   => [
        'raw' => [
            'type'  => 'string',
            'index' => 'not_analyzed'
        ],
        'sort' => [
            'type'  => 'string',
            'index' => 'not_analyzed',
            'analyzer' => 'case_insensitive_sort'
        ]
    ]
]

And here's the updated aggregation portion of my query:

'aggregations' => [
    'brands' => [
        'terms' => [
            'field' => 'brandName.raw',
            'size'  => 0,
            'order' => ['brandName.sort' => 'asc']
        ]
    ]
]

This generates the following error: Invalid term-aggregator order path [brandName.sort]. Unknown aggregation [brandName].

Am I close? Can this kind of aggregation bucket sorting be done?

1 Answer

The short answer is that this kind of advanced sorting on aggregations is not yet supported. There is an open issue tracking it (slated for v2.0.0).

There are two other points worth mentioning here:

  1. The brandName.sort sub-field is declared as not_analyzed, which contradicts setting an analyzer on it at the same time. A not_analyzed field is indexed verbatim, so the case_insensitive_sort analyzer is never applied.

  2. The error you're getting is because the order part can only refer to _term, _count, or the name of a sub-aggregation, not to a field name (and brandName.sort is a field name).
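
Until that is supported, one possible workaround is to aggregate on the lowercased sub-field itself and keep the built-in _term order. The following is only a sketch, assuming Elasticsearch 1.x and assuming lowercased bucket keys are acceptable for your use case. The variable names $brandNameMapping and $aggregations are placeholders for illustration, not part of any API:

```php
// Mapping: 'sort' must be analyzed so that the case_insensitive_sort
// analyzer (keyword tokenizer + lowercase filter) is actually applied.
$brandNameMapping = [
    'type'     => 'string',
    'analyzer' => 'english',
    'index'    => 'analyzed',
    'fields'   => [
        'raw'  => [
            'type'  => 'string',
            'index' => 'not_analyzed'
        ],
        'sort' => [
            'type'     => 'string',
            'index'    => 'analyzed',
            'analyzer' => 'case_insensitive_sort'
        ]
    ]
];

// Aggregation: bucket on the lowercased sub-field and order by the term.
// Because every term is already lowercase, '_term' ordering is effectively
// case-insensitive.
$aggregations = [
    'brands' => [
        'terms' => [
            'field' => 'brandName.sort',
            'size'  => 0,
            'order' => ['_term' => 'asc']
        ]
    ]
];
```

The trade-off is that the bucket keys come back lowercased, and brand names that differ only by case collapse into a single bucket. Existing documents also have to be reindexed after the mapping change. If you need the original casing, another option is to keep aggregating on brandName.raw and sort the returned buckets in PHP with a case-insensitive comparison.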