16
votes

I'm using statsD to report counter data to Graphite; it sends a tick every time I get a message. This works great, except when statsD has to restart for whatever reason. Then I get huge holes in my graphs, because statsD is no longer sending '0' every 10 seconds for periods when I didn't get any messages.

I'm reporting for various different message types and queues, and sometimes I don't get a message for a particular queue for a long time.

Is there any existing way to 'fill-in' the missing data with a default value I specify (in my case this would be 0)?

I thought about sending a '0' count for a given metric so that statsD starts sending 0's for it, but I don't always know the set of metrics I'll be reporting in advance.

4
It turns out there is a function that does exactly what I want: transformNull(). As @ALQ points out, though, it's important to know that it affects aggregates. - BigBen

4 Answers

19
votes

Check out the function transformNull that Graphite provides. e.g.

transformNull(stats.timers.deploys.all.duration.total.mean, 0)

This will map sections with null data to 0.
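To make the effect concrete, here is a minimal pure-Python sketch of what a null-to-default mapping does to a series, and why it changes aggregates. It is not Graphite's implementation; `transform_null` and `mean_ignoring_nulls` are hypothetical helper names, and the sample values are made up.

```python
from typing import List, Optional


def transform_null(series: List[Optional[float]], default: float = 0.0) -> List[float]:
    """Replace every missing (None) datapoint with `default`."""
    return [default if value is None else value for value in series]


def mean_ignoring_nulls(series: List[Optional[float]]) -> Optional[float]:
    """Average only the datapoints that are present; None if there are none."""
    present = [value for value in series if value is not None]
    return sum(present) / len(present) if present else None


# Made-up series with a two-slot gap in the middle.
raw = [4.0, None, None, 2.0]
filled = transform_null(raw, 0.0)

# The gap is now drawn as zeros, but the filled points also count
# toward any average computed afterwards.
mean_before = mean_ignoring_nulls(raw)
mean_after = mean_ignoring_nulls(filled)
```

For a counter, treating "no data" as 0 is usually what you mean. For a timer like the one above, the zeros pull the average down, so apply the fill only where 0 is a truthful value.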

12
votes

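You can use Graphite's keepLastValue function to deal with missing data. The docs list it as "keepLastValue(requestContext, seriesList)", but requestContext is supplied by Graphite itself; in a target expression you pass only the series. It "[c]ontinues the line with the last received value when gaps (‘None’ values) appear in your data, rather than breaking your line."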
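As a rough illustration of that behaviour, the sketch below carries the last seen value forward across gaps. It is a simplified stand-in, not Graphite's code; `keep_last_value` is a hypothetical name and the data is made up.

```python
from typing import List, Optional


def keep_last_value(series: List[Optional[float]]) -> List[Optional[float]]:
    """Carry the most recent non-null value forward over gaps.

    Leading gaps stay None because there is no earlier value to repeat.
    """
    result: List[Optional[float]] = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        result.append(last)
    return result


# Made-up counter series: 3 messages, then silence, then 5 messages.
raw = [None, 3.0, None, None, 5.0, None]
carried = keep_last_value(raw)
```

Note that for per-interval counters this repeats the last count through the gap rather than showing 0, which may overstate traffic during quiet periods.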

9
votes

If you just want to "fill in" the visual graph with zeros, look at "Graph Options -> Line Mode -> Draw Null as Zero". This won't let you set a value other than 0, and it won't cause 0's to show up if you get the data in json or csv format, but it's often what you want if you just want to see a graph with some stretches where no data gets recorded.

[Image: graph without Draw Null as Zero]

[Image: graph with Draw Null as Zero]
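If you build render URLs yourself rather than clicking through the composer, the same option is, as far as I know, exposed as the `drawNullAsZero` parameter of the render API. A small sketch of assembling such a URL follows; the host and metric name are placeholders, and `render_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholder host; substitute your own Graphite web address.
GRAPHITE_HOST = "http://graphite.example.com"


def render_url(target: str, draw_null_as_zero: bool = True) -> str:
    """Build a render URL for one target over the last hour."""
    params = {
        "target": target,
        "from": "-1h",
        "drawNullAsZero": "true" if draw_null_as_zero else "false",
    }
    return f"{GRAPHITE_HOST}/render?{urlencode(params)}"


# Placeholder metric name.
url = render_url("stats.my_queue.messages")
```

As with the menu option, this only changes how the image is drawn; the underlying datapoints are still null.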

2
votes

The solution to this problem is not to keep the last value or transform nulls. Implementing one of those options will only cause you to display incorrect data, and you will not be alerted when something is wrong.

You need to change your storage schema so that its resolution matches the interval at which you're actually sending data, and is no finer.

If metrics are being sent every 5s and your storage schema says 1s, you will get five data points, four of which will be null.
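The arithmetic can be checked with a short simulation. This sketch drops datapoints sent every 5 seconds into fixed-width storage slots and counts how many slots stay empty; `bucketize` is a hypothetical helper and does not model how Graphite actually stores data.

```python
from typing import List, Optional


def bucketize(timestamps: List[int], start: int, end: int, step: int) -> List[Optional[int]]:
    """Return one slot per `step` seconds in [start, end).

    A slot holds 1 if a datapoint landed in it, otherwise None.
    """
    slots: List[Optional[int]] = [None] * ((end - start) // step)
    for ts in timestamps:
        if start <= ts < end:
            slots[(ts - start) // step] = 1
    return slots


# Metrics sent every 5 seconds for 20 seconds: t = 0, 5, 10, 15.
sent_at = list(range(0, 20, 5))

one_second_slots = bucketize(sent_at, 0, 20, step=1)   # schema finer than the send interval
five_second_slots = bucketize(sent_at, 0, 20, step=5)  # schema matching the send interval

nulls_at_1s = one_second_slots.count(None)
nulls_at_5s = five_second_slots.count(None)
```

With 1-second slots, four of every five slots are empty; with 5-second slots, none are. Matching the schema resolution to the send interval removes these structural nulls, although it will not fill gaps from periods when nothing was sent at all.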

Check out this doc: https://github.com/etsy/statsd/blob/master/docs/graphite.md