Elasticsearch newbie question. I loaded shakespeare.json into Elasticsearch, and I'm trying to figure out how to do an aggregation analogous to the SQL query select speaker, count(1) from line group by speaker. ("line" is the document type, and "speaker" is one of its properties.)
Right now I have a query like this:
{
  "size": 0,
  "query": {
    "query": {
      "match": {
        "play_name": "HAMLET"
      }
    }
  },
  "aggs": {
    "line_count": {
      "terms": {
        "field": "speaker.speaker_raw"
      }
    }
  }
}
The results look right, but the Elasticsearch docs specify that document counts for the terms aggregation are approximate (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html). Is there some other way to get exact counts within a bucket?
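One way to check this rather than guess, offered as a rough sketch and not a confirmed fix: the terms aggregation accepts show_term_doc_count_error, which adds a doc_count_error_upper_bound value to each bucket, and shard_size, which controls how many candidate terms each shard sends back before the results are merged. The values 100 and 1000 below are arbitrary assumptions that would need tuning to the number of distinct speakers, and the query is written here with a single "query" wrapper, which is the form recent versions expect.

```json
{
  "size": 0,
  "query": {
    "match": {
      "play_name": "HAMLET"
    }
  },
  "aggs": {
    "line_count": {
      "terms": {
        "field": "speaker.speaker_raw",
        "size": 100,
        "shard_size": 1000,
        "show_term_doc_count_error": true
      }
    }
  }
}
```

If every bucket comes back with an error bound of 0, the counts in that response are exact. An index with a single primary shard should always be in that situation, since there is nothing to merge across shards.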
Separately, I already figured out that I had to pre-define an un-analyzed version of "speaker" in the index mapping, so that I can aggregate on the original field values rather than on the analyzed tokens. (See Elasticsearch - Cardinality over Full Field Value)
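For reference, a mapping along these lines is roughly what that looks like. This is a sketch under assumptions: it is meant as the body of an index-creation request for an index assumed to be named shakespeare, and it uses the string / not_analyzed syntax of Elasticsearch versions before 5.0. On 5.0 and later the sub-field would instead be declared with "type": "keyword" under a "text" parent field.

```json
{
  "mappings": {
    "line": {
      "properties": {
        "speaker": {
          "type": "string",
          "fields": {
            "speaker_raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
```

With this in place, "speaker" is still analyzed for full-text search, while "speaker.speaker_raw" keeps each value as one un-tokenized term for the aggregation.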
terms aggregations, the only approximate values are (IIRC) for cardinality and percentiles aggregations. See: elastic.co/guide/en/elasticsearch/reference/current/… And: elastic.co/guide/en/elasticsearch/reference/current/… - Or Weinberger
size:0 it should be accurate, what do you think? - Or Weinberger