Could anyone suggest the best pattern for gathering metrics from a cluster of nodes (each node is a Tomcat Docker container running a Java app)?
We're planning to use the ELK stack (Elasticsearch, Logstash, Kibana) for visualization, but the question for us is how the metrics should be delivered to Elasticsearch so that Kibana can display them.
We're using the Dropwizard Metrics library, which provides per-instance metrics (gauges, timers, histograms).
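For context, we record timings roughly like this (a minimal sketch of the Dropwizard Metrics API; the registry and the metric name `api.responses` are just placeholders, and this needs the `metrics-core` dependency):

```java
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

MetricRegistry registry = new MetricRegistry();
// Per-instance timer: keeps a histogram of durations plus 1/5/15-minute rates.
Timer responses = registry.timer("api.responses");

try (Timer.Context ctx = responses.time()) {
    // handle the request; the duration is recorded when ctx closes
}

// These values are already aggregated, but only for this one instance:
double p99 = responses.getSnapshot().get99thPercentile();
double oneMinuteRate = responses.getOneMinuteRate();
```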
Some metrics obviously should be gathered per instance (e.g. CPU, memory); it doesn't make any sense to aggregate them across the cluster.
But for metrics such as average API response time or database call duration, we want a clear global picture, i.e. not one per individual instance.
And here is where we are hesitating. Should we:
- Just send the plain per-request values to Elasticsearch and let Kibana calculate averages, percentiles, etc.? In this approach all aggregation happens in Kibana.
- Use timers and histograms per instance and send those instead? But since this data is already aggregated per instance (a timer already provides percentiles and 1-, 5- and 15-minute rates), how should Kibana handle it to show a global picture? Does it make sense to aggregate already-aggregated data?
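To make the concern with the second option concrete, here is a small self-contained sketch (plain Java, made-up numbers, no Dropwizard dependency) showing that averaging two instances' p95 values does not reproduce the cluster-wide p95 you'd get from the raw data:

```java
import java.util.Arrays;

public class PercentileMerge {
    // Nearest-rank percentile of an ascending-sorted array.
    static double percentile(double[] sorted, double p) {
        int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(0, idx)];
    }

    static final double AVG_OF_P95;   // what averaging pre-aggregated timers yields
    static final double GLOBAL_P95;   // what the raw per-request values yield

    static {
        // Hypothetical response times (ms): one fast instance, one slow one.
        // Both arrays are generated in ascending order, so no sort is needed.
        double[] fast = new double[100];
        double[] slow = new double[100];
        for (int i = 0; i < 100; i++) {
            fast[i] = 10 + i * 0.1;   // 10.0 .. 19.9 ms
            slow[i] = 100 + i;        // 100 .. 199 ms
        }

        AVG_OF_P95 = (percentile(fast, 95) + percentile(slow, 95)) / 2;

        double[] all = new double[200];
        System.arraycopy(fast, 0, all, 0, 100);
        System.arraycopy(slow, 0, all, 100, 100);
        Arrays.sort(all);
        GLOBAL_P95 = percentile(all, 95);
    }

    public static void main(String[] args) {
        System.out.println("average of per-instance p95s: " + AVG_OF_P95); // ~106.7 ms
        System.out.println("true global p95:              " + GLOBAL_P95); // 189.0 ms
    }
}
```

The averaged figure (~106.7 ms) badly understates the true global p95 (189.0 ms), which is why we are unsure whether aggregating pre-aggregated timers in Kibana can be meaningful at all.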
Thanks in advance,