
I'm trying to measure an online mini-batch processing system with a per-second metric (total queries per second). For every batch, a metric (e.g. "stats.gauges.<host>.query.count") is sent to Graphite. Batches are processed on several different hosts in parallel, and a batch of data takes about 5 seconds to process. I've tried:

  1. Simply summing the series: sumSeries(stats.gauges.*.query.count). The resulting metric is many times greater than the actual value.
  2. Scaling to 1 second: scaleToSeconds(sumSeries(stats.gauges.*.query.count), 1). The resulting metric is much less than the actual value.
  3. Integral, then derivative: nonNegativeDerivative(sumSeries(integral(stats.gauges.*.query.count))). Same result as the first case.
  4. Sending gauges with the delta=True param, then taking the derivative. The result is about 20% greater than the actual value.

So, how do I get per-second metrics from batch metrics? What is the best practice?


1 Answer


You should use the carbon-aggregator service to add several metrics together as they come in. There is an example that fits your case at http://graphite.readthedocs.io/en/latest/config-carbon.html#aggregation-rules-conf

As your batch takes 5 seconds to process, the frequency should be 5 so that all the metrics are buffered. After five seconds, the aggregator will sum them up and write the result to Graphite.
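
A minimal sketch of such a rule in aggregation-rules.conf, following the `output_template (frequency) = method input_pattern` form from the linked docs. The output name stats.aggregates.query.count is a hypothetical choice, not something from your setup; it is deliberately kept outside the stats.gauges.*.query.count pattern so that the aggregated series is not matched by the same wildcard later. This also assumes your hosts send their metrics through carbon-aggregator rather than directly to carbon-cache.

```
# Every 5 seconds, sum the per-host batch counts into a single series.
# "stats.aggregates.query.count" is an illustrative name; pick your own.
stats.aggregates.query.count (5) = sum stats.gauges.*.query.count
```

If each point of the aggregated series then holds the total query count for one 5-second window, dividing by 5 at render time, e.g. scale(stats.aggregates.query.count, 0.2), should give an approximate per-second figure. That only holds if the storage schema for the new metric also uses a 5-second resolution, so treat it as a starting point to verify against known traffic.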