I'm using statsd (the latest version from the git master branch) with graphite (0.9.10) as the backend.

In my (Django) code I call statsd.incr("signups") when a user signs up. In graphite's web interface, I now see a beautiful graph showing the number of signups per second under Graphite/stats/signups. When I look at the graph under Graphite/stats_counts/signups, I expect to see the total number of signups, but it looks like it's the number of signups per 10s interval (that's statsd's refresh interval, I guess).

I did configure storage-aggregation.conf, but perhaps I got it wrong somehow. I also killed carbon outright (a plain stop apparently doesn't make it reload the configuration), deleted the /opt/graphite/storage/whisper/stats_counts directory, and then restarted the carbon daemon. I still get the number of signups per 10s interval. :-(

Here's my configuration:

# /opt/graphite/conf/storage-aggregation.conf

[lower]
pattern = \.lower$
xFilesFactor = 0.1
aggregationMethod = min

[upper]
pattern = \.upper$
xFilesFactor = 0.1
aggregationMethod = max

[upper_90]
pattern = \.upper_90$
xFilesFactor = 0.1
aggregationMethod = max

[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum

[count_ps]
pattern = \.count_ps$
xFilesFactor = 0
aggregationMethod = sum

[sum]
pattern = \.sum$
xFilesFactor = 0
aggregationMethod = sum

[sum_90]
pattern = \.sum_90$
xFilesFactor = 0
aggregationMethod = sum

[stats_counts]
pattern = ^stats_counts\.
xFilesFactor = 0
aggregationMethod = sum

[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min

[max]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max

[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average

And this:

# /opt/graphite/conf/storage-schemas.conf

[stats]
priority = 110
pattern = ^stats.*
retentions = 10s:6h,1m:7d,10m:1y
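
For reference, aggregationMethod only controls how fine-grained points are rolled up into the coarser archives (here 10s into 1m, then 1m into 10m); it does not keep a running total. Below is a minimal Python sketch of that rollup, loosely modelled on Whisper's behaviour. The function name `rollup` and the sample values are invented for illustration.

```python
def rollup(points, factor, method="sum", x_files_factor=0.0):
    """Collapse each group of `factor` fine-grained points into one coarse point.

    None marks a missing point. A group yields None when it has no known
    points, or when the fraction of known points is below x_files_factor.
    """
    coarse = []
    for start in range(0, len(points), factor):
        group = points[start:start + factor]
        known = [p for p in group if p is not None]
        if not known or len(known) / len(group) < x_files_factor:
            coarse.append(None)
        elif method == "sum":
            coarse.append(sum(known))
        else:  # treat anything else as "average" in this sketch
            coarse.append(sum(known) / len(known))
    return coarse


# Twelve 10s buckets of signups (two minutes of made-up data).
ten_second_counts = [1, 0, 2, 0, 0, 1, 3, 0, 0, 1, 1, 0]
per_minute = rollup(ten_second_counts, factor=6)  # 10s -> 1m
print(per_minute)  # [4, 5]: signups per minute, still not a cumulative total
```

So with aggregationMethod = sum, the 1m archive holds signups per minute and the 10m archive signups per 10 minutes, which matches the per-interval graph described above.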

I'm starting to think that I did everything right and that Graphite is really doing what it's supposed to be doing. So the question is:

What's the proper way to configure statsd & graphite to draw the total number of signups since the beginning of time?

I guess I could change my Django code to count the total number of users, once in a while, and then use a gauge instead of an incr, but it feels like graphite should be able to just sum up whatever it receives, on the fly, not just when it aggregates data.
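
A rough sketch of that gauge alternative, assuming StatsD listens on its default UDP port 8125 on localhost and using the plain StatsD wire format (`name:value|g`). The helper names `gauge_packet` and `send_gauge`, the metric name `signups_total`, and the hard-coded user count are placeholders; in Django the count would come from something like a query on the user table.

```python
import socket


def gauge_packet(name, value):
    """Build a StatsD gauge datagram, e.g. b"signups_total:1234|g"."""
    return f"{name}:{value}|g".encode("ascii")


def send_gauge(name, value, host="127.0.0.1", port=8125):
    """Send one gauge over UDP (fire and forget, like statsd.incr)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(gauge_packet(name, value), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    total_users = 1234  # placeholder for the real user count
    send_gauge("signups_total", total_users)
```

Running this periodically (for example from a cron job or a management command) would give Graphite an absolute value to plot, at the cost of an extra database query.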

Edit:

Using the Graphite composer in Graphite's web interface, I applied the integral function to the basic "signups per second" graph (in Graphite/stats/signups) and got the desired graph (i.e. the total number of signups). Is this the appropriate way to get a cumulative graph? It's annoying because I need to select the full date range since the beginning of time. I cannot zoom into the graph, or else I just get the integral of the zoomed part. :-(

1 Answer

Yes, the integral() function is the correct way to do this. StatsD is stateless in that regard: all collected data is reset after each flush to Graphite, so it has no way to sum up everything it has received since a certain point.

From the Graphite documentation of the integral() function:

This will show the sum over time, sort of like a continuous addition function. Useful for finding totals or trends in metrics that are collected per minute.
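
To illustrate what that means in practice, here is a small Python sketch of a running sum in the spirit of integral(). The local function `integral` is a stand-in written for this example, not Graphite's own code, and the sample values are invented.

```python
def integral(series):
    """Running sum over a series; missing points (None) stay None."""
    total = 0
    out = []
    for value in series:
        if value is None:
            out.append(None)
        else:
            total += value
            out.append(total)
    return out


# Signups per flush interval, as stored under stats_counts (made-up data).
counts_per_flush = [1, 0, 2, None, 3]
print(integral(counts_per_flush))      # [1, 1, 3, None, 6]

# Zooming in restarts the sum at the left edge of the window.
print(integral(counts_per_flush[2:]))  # [2, None, 5]
```

The second call shows why zooming changes the result: the sum only covers the points in the requested time range. Also, as far as I can tell, integral() adds up the datapoints as they are, without multiplying by the interval length, so applying it to stats_counts.signups gives the actual total, while applying it to the per-second series in stats.signups gives a curve of the same shape scaled down by the flush interval (10s here).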