With Elasticsearch I know I can run some nice time-series queries and get the mean, max, etc.
Is it possible, though, to include only the 90th percentile in that calculation, and in Kibana in particular?
Any thoughts on how this could be done?
Elasticsearch doesn't currently support percentiles (including median).
Percentiles are much harder to compute in a distributed environment than simple statistics such as the average. Let's assume you have 2 shards. If you ask both of them for the sum of their values and the number of values, you can derive the global average: ($sum1 + $sum2) / ($value_count1 + $value_count2).
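To make the average case concrete, here is a minimal Python sketch of that merge step. The shard contents and the helper names `partial_stats` and `merge_mean` are made up for illustration and are not Elasticsearch APIs; the point is only that each shard ships two numbers, whatever its size.

```python
# Hypothetical shard contents, for illustration only.
shard1 = [3.0, 5.0, 8.0]
shard2 = [1.0, 9.0, 10.0, 12.0]

def partial_stats(values):
    """What a single shard sends back: (sum, count)."""
    return sum(values), len(values)

def merge_mean(partials):
    """Combine per-shard (sum, count) pairs into the exact global mean."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

global_mean = merge_mean([partial_stats(shard1), partial_stats(shard2)])

# Same result as averaging all values in one place.
all_values = shard1 + shard2
direct_mean = sum(all_values) / len(all_values)
```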
On the other hand, if you want to compute the median, the only way to compute it accurately is to get all values from both shards, sort them, and take the median. This would require lots of memory and network bandwidth.
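A small Python sketch shows why per-shard medians cannot simply be combined the way sums and counts can. The shard contents are invented for the example.

```python
from statistics import median

# Hypothetical shard contents, for illustration only.
shard1 = [1, 2, 3, 4, 100]
shard2 = [5, 6]

# Tempting shortcut: each shard returns its own median and we combine them.
median_of_medians = median([median(shard1), median(shard2)])

# Accurate answer: gather every value from every shard, then take the median.
true_median = median(shard1 + shard2)

# The two disagree, so a per-shard median is not enough information to merge.
shortcut_is_wrong = median_of_medians != true_median
```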
Fortunately, there are algorithms that compute good approximations of percentiles with limited memory usage. We are looking into t-digest in particular, so it is quite likely that (approximate) percentiles will be supported in a future release of Elasticsearch.
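To give a feel for how a mergeable, bounded-memory summary can work, here is a rough Python sketch. Note that this is not t-digest: it uses a much cruder fixed-width histogram, chosen only because it fits in a few lines. `BUCKET_WIDTH`, `shard_histogram` and `approx_percentile` are hypothetical names, and the error is bounded by the bucket width rather than adapting to the data as t-digest does.

```python
from collections import Counter

BUCKET_WIDTH = 10  # assumed resolution: smaller is more accurate but uses more memory

def shard_histogram(values):
    """Per-shard summary: bucket index -> count. Size depends on the value range, not on how many values there are."""
    return Counter(int(v // BUCKET_WIDTH) for v in values)

def approx_percentile(histograms, pct):
    """Merge per-shard histograms and return the upper edge of the bucket holding the requested percentile."""
    merged = Counter()
    for h in histograms:
        merged.update(h)  # Counter.update adds counts bucket by bucket
    target = pct / 100 * sum(merged.values())
    seen = 0
    for bucket in sorted(merged):
        seen += merged[bucket]
        if seen >= target:
            return (bucket + 1) * BUCKET_WIDTH
    return None  # no data at all

# Hypothetical shards: 1000 values split across two shards.
shard1 = list(range(0, 500))
shard2 = list(range(500, 1000))
p90 = approx_percentile([shard_histogram(shard1), shard_histogram(shard2)], 90)
```

Each shard sends at most one counter per bucket, so the network cost stays fixed as the shard grows, at the price of an answer that is only accurate to within one bucket.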