
I'm trying to run a significant terms aggregation on documents that have been tagged with keywords. The problem is that some of these tags consist of multiple words. For example, the tag 'markup languages' is handled by the aggregation as two separate tags, 'markup' and 'languages'. Is there a way to run a significant terms aggregation on the tags field that treats each multi-word tag as a single term? The query I am using is below:


    {
        "query": {
            "terms": {
                "display": [
                    true
                ]
            }
        },
        "size": 0,
        "aggregations": {
            "significantTags": {
                "significant_terms": {
                    "field": "tags",
                    "size": 100
                }
            }
        }
    }

1 Answer

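This turned out to be an indexing problem rather than a query problem. The field 'tags' was mapped to the type 'text', so each tag was analyzed and split into individual words at index time, which is why 'markup languages' showed up as 'markup' and 'languages'. Re-indexing the data with 'tags' mapped to the type 'keyword' instead of 'text' resolved the issue. Each tag is now indexed as a single, unanalyzed term and the aggregation works as expected, with no change to the query.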
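
For anyone hitting the same thing, here is a rough sketch of what the fix can look like. It assumes Elasticsearch 7 or later, where mappings have no type name (on 5.x and 6.x the "properties" block sits under a mapping type name instead). The index names "tagged-docs" and "tagged-docs-v2" are placeholders I made up for illustration, not the names from my cluster. The first body would be sent when creating the new index (PUT /tagged-docs-v2), since the type of an existing field cannot be changed in place:

```json
{
  "mappings": {
    "properties": {
      "tags": {
        "type": "keyword"
      }
    }
  }
}
```

The existing documents can then be copied across, for example with the reindex API (POST /_reindex), again using the placeholder index names:

```json
{
  "source": {
    "index": "tagged-docs"
  },
  "dest": {
    "index": "tagged-docs-v2"
  }
}
```

After that, the aggregation from the question can be run unchanged against the new index, and "field": "tags" should return multi-word tags such as 'markup languages' as single buckets.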