Elasticsearch newbie question. I loaded shakespeare.json into Elasticsearch, and I'm trying to figure out how to do an aggregation analogous to this SQL:
    select speaker, count(1) from line group by speaker
("Line" is the type of document, and "speaker" is one of the properties.)
Now I have a query like this:
{
  "size": 0,
  "query": {
    "query": {
      "match": {
        "play_name": "HAMLET"
      }
    }
  },
  "aggs": {
    "line_count": {
      "terms": {
        "field": "speaker.speaker_raw"
      }
    }
  }
}
The results look right, but the Elasticsearch docs specify that document counts for the terms aggregation are approximate (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html). Is there some other magic to get exact counts within a bucket?
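To make the "approximate" part concrete, here is a minimal pure-Python sketch of how I understand the approximation to arise: each shard reports only its own top terms, and the coordinating node sums what it receives. This is a simplified model, not Elasticsearch's actual code. The function names, the two toy shards, and the counts in them are made up for illustration and are not real figures from the Shakespeare data set.

```python
from collections import Counter


def shard_top_terms(docs, shard_size):
    """Return the shard-local top `shard_size` terms with their local counts."""
    return Counter(docs).most_common(shard_size)


def merged_terms(shards, size, shard_size):
    """Sum the per-shard top lists and keep the overall top `size` terms."""
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, shard_size):
            merged[term] += count
    return dict(merged.most_common(size))


# Toy data: one list of speaker values per shard (invented counts).
shards = [
    ["HAMLET"] * 5 + ["HORATIO"] * 3 + ["OPHELIA"] * 2,
    ["HAMLET"] * 4 + ["OPHELIA"] * 3 + ["HORATIO"] * 2,
]

# Ground truth: count every document on every shard.
exact = Counter(term for docs in shards for term in docs)

# Each shard only reports its top 2 terms, so some counts go missing.
approx = merged_terms(shards, size=2, shard_size=2)

# Each shard reports all 3 of its terms, so nothing is lost.
full = merged_terms(shards, size=2, shard_size=3)

print("exact :", dict(exact))
print("approx:", approx)
print("full  :", full)
```

In this model the counts are undercounted only when a shard leaves a term out of its local top list. If every shard returns all of its terms (or there is only one shard), the merged counts match the exact ones.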
Separately, I already figured out that I had to pre-define a field on the index holding an un-analyzed version of "speaker", so that I can aggregate on the original field values rather than on individual tokens. (See Elasticsearch - Cardinality over Full Field Value)
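For reference, a mapping along these lines would give that kind of un-analyzed sub-field. This is only a sketch: the actual mapping used above is not shown, so the field types here are assumptions. Only the sub-field name speaker_raw is taken from the query. The helper speaker_mapping and its legacy flag are hypothetical names, and the dict covers just the "properties" part of a mapping.

```python
import json


def speaker_mapping(legacy=False):
    """Build the "properties" part of a mapping with a raw speaker sub-field.

    legacy=True  -> older style: string field with a not_analyzed sub-field
    legacy=False -> newer style: text field with a keyword sub-field
    (Which style applies depends on the Elasticsearch version; assumption.)
    """
    if legacy:
        main_type = "string"
        raw = {"type": "string", "index": "not_analyzed"}
    else:
        main_type = "text"
        raw = {"type": "keyword"}
    return {
        "properties": {
            "speaker": {
                "type": main_type,
                # Queried as "speaker.speaker_raw" in the terms aggregation.
                "fields": {"speaker_raw": raw},
            }
        }
    }


print(json.dumps(speaker_mapping(), indent=2))
```

The mapping has to be in place before the documents are indexed, since existing documents are not re-analyzed when a sub-field is added later.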
"terms" aggregations, the only approximate values are (IIRC) for cardinality and percentiles aggregations. See: elastic.co/guide/en/elasticsearch/reference/current/… And: elastic.co/guide/en/elasticsearch/reference/current/… - Or Weinberger
"size:0" it should be accurate, what do you think? - Or Weinberger
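One way to check this rather than guess is to look at the error bound that the terms aggregation reports in its response. As far as I know, the response for a terms aggregation includes doc_count_error_upper_bound and sum_other_doc_count next to the buckets. The helper below is a sketch that inspects an already-parsed response. The function name is hypothetical, and the sample dict is a hand-written stand-in for a response, not real output.

```python
def terms_counts_are_exact(response, agg_name):
    """Return True if the reported error bound for the aggregation is zero.

    `response` is the parsed JSON body of a search response (a dict).
    An upper bound of 0 means no document count can be off.
    """
    agg = response["aggregations"][agg_name]
    return agg["doc_count_error_upper_bound"] == 0


def all_terms_returned(response, agg_name):
    """Return True if no documents fell into terms outside the returned buckets."""
    agg = response["aggregations"][agg_name]
    return agg["sum_other_doc_count"] == 0


# Hand-written stand-in for a response (invented numbers, for illustration).
sample = {
    "aggregations": {
        "line_count": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 12,
            "buckets": [{"key": "HAMLET", "doc_count": 3}],
        }
    }
}

print(terms_counts_are_exact(sample, "line_count"))
print(all_terms_returned(sample, "line_count"))
```

If the bound comes back non-zero, raising shard_size on the terms aggregation or setting show_term_doc_count_error to true to get a per-bucket bound are the documented knobs to try.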