2
votes

I'm using Prometheus to scrape AWS CloudWatch metrics and send alerts when the number of messages in certain SQS queues spikes. Say my queue depth graph looks like this: [image: a line graph trending downwards]

I want an alarm only when it spikes upward. Currently, I'm using the expression increase(QueueDepthMetric[10m]), where QueueDepthMetric is the metric shown in the graph above. I expected this to show spikes only where the metric increases, but it instead shows spikes where the metric's slope increases: [image: another line graph, showing the derivative values of the first graph]

This causes the alarm threshold to be reached on any spike, both positive and negative. After browsing the "Query Functions" page of the Prometheus documentation, I was unable to find the function that I'm looking for.

Is there a metric function or formula in Prometheus that will only show increases in a metric, rather than any net change?

Note that I'm not looking to determine a raw Queue Depth threshold; rather, I'm looking to determine when the number increases dramatically.


1 Answer

3
votes

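increase is for counters, and queue depth is a gauge. Those spikes are actually where the value decreased: each drop was treated as a counter reset, so increase assumed the counter had restarted from zero and counted the value after the drop as a new increase.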
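One way to check this, as a rough sketch: resets() counts how often a series dropped between consecutive samples, which is the same condition used to detect a counter reset. The metric name and 10m window are taken from the question; whether the count lines up with the spikes on your graph is something to verify against your own data.

```promql
# Number of sample-to-sample decreases in the last 10 minutes.
# For a real counter this should almost always be 0; for a gauge such as
# queue depth it is nonzero whenever the queue drains.
resets(QueueDepthMetric[10m])
```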

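What you want is deriv, which gives you the per-second slope over the given time period based on a simple linear regression. The result is positive while the value is rising and negative while it is falling, so you can alert only on upward movement.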
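A minimal sketch of how that could look for the queue in the question. The 10m window comes from the question; the threshold of 0 is a placeholder assumption to tune, and delta is shown as a possible alternative for gauges rather than something the answer prescribes. Each expression is a separate query.

```promql
# Per-second slope of the gauge over the last 10 minutes (linear regression).
# Positive means the queue is growing, negative means it is draining.
deriv(QueueDepthMetric[10m])

# Keep only series whose queue depth is trending upward.
# 0 is a placeholder threshold; raise it to alert only on steep growth.
deriv(QueueDepthMetric[10m]) > 0

# Alternative for gauges: change between the first and last sample in the
# window, which can be negative, unlike increase().
delta(QueueDepthMetric[10m]) > 0
```

Because deriv returns messages per second, a threshold expressed as "N more messages over 10 minutes" has to be divided by 600 before comparing.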