1
votes

We use statsd as an aggregator that forwards to Graphite every 60 seconds.

I can see Graphite filling the "stats.timers" buckets, but not all of the expected ones.

On the Graphite machine:

graphite:/opt/graphite # find .../xxx/desktopapp/members/contacting -name "*.wsp"
.../xxx/desktopapp/members/contacting/lastVisitors/mean_90.wsp
.../xxx/desktopapp/members/contacting/lastVisitors/sum.wsp
.../xxx/desktopapp/members/contacting/lastVisitors/std.wsp
.../xxx/desktopapp/members/contacting/welcome/count_ps.wsp
.../xxx/desktopapp/members/contacting/feditWelcome/mean.wsp
.../xxx/desktopapp/members/contacting/contacts/count.wsp
.../xxx/desktopapp/members/contacting/contacts/sum_90.wsp
.../xxx/desktopapp/members/contacting/preContact/count_ps.wsp
.../xxx/desktopapp/members/contacting/preContact/mean_90.wsp
.../xxx/desktopapp/members/contacting/preContact/sum.wsp
.../xxx/desktopapp/members/contacting/preContact/std.wsp
.../xxx/desktopapp/members/contacting/preContact/count.wsp
.../xxx/desktopapp/members/contacting/preContact/sum_90.wsp
.../xxx/desktopapp/members/contacting/fedit/upper.wsp
.../xxx/desktopapp/members/contacting/preWelcome/count_ps.wsp
.../xxx/desktopapp/members/contacting/preWelcome/sum.wsp
.../xxx/desktopapp/members/contacting/preWelcome/std.wsp
.../xxx/desktopapp/members/contacting/contact/count_ps.wsp
.../xxx/desktopapp/members/contacting/contact/sum.wsp
.../xxx/desktopapp/members/contacting/contact/std.wsp
.../xxx/desktopapp/members/contacting/favorite/median.wsp

Looking at the statsd source code (https://github.com/etsy/statsd/blob/master/lib/process_metrics.js), I would expect the following metrics to appear (each as its own bucket) for each thing I time.

Source:

    current_timer_data["std"] = stddev;
    current_timer_data["upper"] = max;
    current_timer_data["lower"] = min;
    current_timer_data["count"] = timer_counters[key];
    current_timer_data["count_ps"] = timer_counters[key] / (flushInterval / 1000);
    current_timer_data["sum"] = sum;
    current_timer_data["mean"] = mean;
    current_timer_data["median"] = median;
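
To make that expectation concrete, here is a simplified sketch of how those eight values could be derived from the raw timings of one flush. It is not statsd's actual code: the function name `timerStats` is made up for illustration, sample rates and percentThreshold are ignored, a non-empty list of values is assumed, and the standard deviation is assumed to be the population form.

```javascript
// Simplified sketch of the per-timer stats named in the quoted snippet.
// Assumptions: non-empty input, no sample rates, no percentThreshold.
function timerStats(values, flushIntervalMs) {
  const sorted = values.slice().sort((a, b) => a - b);
  const count = sorted.length;
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  const mean = sum / count;
  const mid = Math.floor(count / 2);
  // Middle value for odd counts, average of the two middle values otherwise.
  const median = count % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const sumOfDiffs = sorted.reduce((acc, v) => acc + (v - mean) ** 2, 0);
  return {
    std: Math.sqrt(sumOfDiffs / count),
    upper: sorted[count - 1],
    lower: sorted[0],
    count: count,
    count_ps: count / (flushIntervalMs / 1000),
    sum: sum,
    mean: mean,
    median: median,
  };
}

// One flush of a single timer with a 60 second flush interval.
const stats = timerStats([100, 200, 300, 400], 60000);
console.log(Object.keys(stats).join(", "));
```

In this sketch all eight values come out of the same call, which is why I would expect eight .wsp files per timer rather than one or two.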

Does anybody have an idea why for some timers I only get "count_ps" and for others I get "upper"? Does it take some time for Graphite to process its internal statistics queue(s)?

The statsd log says roughly 500 numStats per minute are sent:

13 Mar 10:13:53 - DEBUG: numStats: 498
13 Mar 10:14:53 - DEBUG: numStats: 506
13 Mar 10:15:53 - DEBUG: numStats: 491
13 Mar 10:16:53 - DEBUG: numStats: 500
13 Mar 10:17:53 - DEBUG: numStats: 488
13 Mar 10:18:53 - DEBUG: numStats: 482
13 Mar 10:19:53 - DEBUG: numStats: 486

any help highly appreciated

cheers marcel

2 Answers

1
votes

@marcel, did you configure percentThreshold in the local.js of your statsd? Also, to get the "upper" metric you first need to check how the metrics are being sent to statsd. To use statsd's timer buckets, each metric has to be sent with its type set to timer, like this:

echo "xx.yy.zz:<data point>|ms"
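
If you send from application code rather than the shell, a minimal Node sketch could look like the following. The helper names `timerLine` and `sendTimer` are made up for illustration, and the host and port are assumptions (8125 is statsd's default listening port); `ms` is the type suffix statsd uses for timers.

```javascript
const dgram = require("dgram");

// Build one statsd line; "ms" marks the value as a timer.
function timerLine(bucket, millis) {
  return `${bucket}:${millis}|ms`;
}

// Fire-and-forget send over UDP.
// Host and port are assumptions; adjust them to your statsd instance.
function sendTimer(bucket, millis, host = "127.0.0.1", port = 8125) {
  const socket = dgram.createSocket("udp4");
  const payload = Buffer.from(timerLine(bucket, millis));
  socket.send(payload, port, host, (err) => {
    if (err) console.error("send failed:", err.message);
    socket.close();
  });
}

// Example call (placeholder bucket name from above):
// sendTimer("xx.yy.zz", 320);
```

Anything sent with a different type suffix will not end up under "stats.timers", so it is worth checking the exact line your client emits.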

0
votes

In my experience, with sparse data sets it takes a while for all the stats to show up in Graphite. I don't know the exact threshold, but a certain amount of data seems to have to be pushed into Graphite for a metric before all of the different timer stats appear.
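
One way to watch this over time is to feed the `find` output from the question into a small script and list what is still missing per timer. This is only a sketch: `missingStats` is a made-up helper, and the expected names are just the eight from the statsd snippet quoted in the question (percentile stats such as mean_90 are not checked).

```javascript
const path = require("path");

// Stat names taken from the statsd snippet quoted in the question.
const EXPECTED = ["std", "upper", "lower", "count", "count_ps", "sum", "mean", "median"];

// wspPaths: output lines of `find ... -name "*.wsp"`.
// Returns { timerDir: [missing stat names] } for incomplete timers only.
function missingStats(wspPaths) {
  const seen = new Map();
  for (const p of wspPaths) {
    const dir = path.dirname(p);
    if (!seen.has(dir)) seen.set(dir, new Set());
    seen.get(dir).add(path.basename(p, ".wsp"));
  }
  const report = {};
  for (const [dir, stats] of seen) {
    const missing = EXPECTED.filter((name) => !stats.has(name));
    if (missing.length > 0) report[dir] = missing;
  }
  return report;
}

// Two lines copied from the listing in the question.
const sample = [
  ".../xxx/desktopapp/members/contacting/welcome/count_ps.wsp",
  ".../xxx/desktopapp/members/contacting/fedit/upper.wsp",
];
console.log(missingStats(sample));
```

Running it once a minute against a fresh `find` listing should show whether the missing files fill in as more data arrives.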