I've seen other threads/posts (on GitHub, Stack Overflow) where people have requested the ability in Prometheus to filter or mark metrics as stale/expired based on the metrics' timestamp (when they were last pushed to Pushgateway). It seems that this goes against the Prometheus way of working, and that is fine. However, I want to know how people have worked around this.

I've been trying out a few things but unfortunately haven't had success:

  • Added a label to the metric that contains the epoch time, intending to use the label value to filter the metrics (or to set the metric's value to some status that marks it as stale).
    • I found that the label value is a string, and I haven't been able to convert it to an integer to compare it against the current time (something like "(time() - timestamp) > 3600").
  • Use the job's "push_time_seconds" metric to identify when data was last pushed, and filter or mark the data as stale based on that. Pushgateway adds this metric automatically whenever a user pushes data. For example, if I were to push the following data:

cat <<EOF | curl --data-binary @- http://localhost:9091/metrics/job/test
some_metric{instance="",label1="value1",label2="value2"} 5
EOF

I see the following metrics in Pushgateway:

push_time_seconds{instance="",job="test"} 1.5754837280426762e+09
some_metric{instance="",job="test",label1="value1",label2="value2"} 5

However, I don't know how to build a PromQL query that uses the push_time_seconds metric to change the value of some_metric. For example, if push_time_seconds is more than an hour old, set the value of some_metric to 0.

Anyone have advice on this?

1 Answer

I found another database with a PromQL-based query language, called VictoriaMetrics. I was able to use its boolean and "if" operators with push_time_seconds in my queries to do what I want.

I ended up using two approaches:

  • script/batch job -> pushgateway <- prometheus -> VictoriaMetrics <- Grafana (using VictoriaMetrics as a Prometheus-type data source)
    • This uses the boolean logic shown below.
  • script/batch job -> VictoriaMetrics <- Grafana (using VictoriaMetrics as a Prometheus-type data source)
    • This gets rid of the need for pushgateway altogether.
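
For the second approach, a minimal sketch of pushing a sample straight to VictoriaMetrics from a Python script might look like the following. It assumes a single-node VictoriaMetrics listening on http://localhost:8428 and uses its Prometheus text-format import endpoint (/api/v1/import/prometheus). The function names and the default URL are assumptions made for illustration, not part of my actual setup.

```python
import urllib.request


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline as the Prometheus text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name, labels, value, timestamp_ms=None):
    """Render one sample as a Prometheus text-format line.

    labels is a dict; keys are sorted so the output is deterministic.
    timestamp_ms is optional (milliseconds since the epoch).
    """
    label_str = ",".join(
        f'{key}="{escape_label_value(str(val))}"'
        for key, val in sorted(labels.items())
    )
    line = f"{name}{{{label_str}}} {value}" if label_str else f"{name} {value}"
    if timestamp_ms is not None:
        line += f" {timestamp_ms}"
    return line


def push_to_victoriametrics(lines, base_url="http://localhost:8428"):
    """POST text-format lines to the import endpoint (base_url is an assumed default)."""
    body = ("\n".join(lines) + "\n").encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/v1/import/prometheus", data=body, method="POST"
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A batch job could then call push_to_victoriametrics([format_sample("some_metric", {"job": "test"}, 5)]) at the end of each run. Note that without Pushgateway there is no automatic push_time_seconds series, so the script would have to write its own timestamp metric if the freshness check is still needed.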

Let me know if anyone would like more info.

A query that only returns the metric while the last push is recent (the 'job' label is the key here):

avg(SomeMetric{job="some_job"}) if (time() - push_time_seconds{job="some_job"} < 30)
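
To make the intent of that filter concrete, here is a small Python model of what it computes for a single series. It only illustrates the logic and is not how VictoriaMetrics evaluates the query; value_if_fresh and its parameters are hypothetical names.

```python
import time
from typing import Optional


def value_if_fresh(value: float, push_time_seconds: float,
                   max_age_seconds: float = 30.0,
                   now: Optional[float] = None) -> Optional[float]:
    """Return value only if the last push is younger than max_age_seconds.

    Mirrors `metric if (time() - push_time_seconds < 30)`: a stale series
    yields no sample at all (None here), rather than 0.
    """
    if now is None:
        now = time.time()
    return value if (now - push_time_seconds) < max_age_seconds else None
```

For the one-hour window from the question, max_age_seconds would be 3600.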

An example of a comparison/boolean expression:

WITH (x = avg(SomeMetric{job="some_job"}), y = (NaN if 3 < 2) default 2) (y default 3)

Combining both:

WITH (x = avg(SomeMetric{job="some_job"}), y = (NaN if (time() - push_time_seconds{job="some_job"} < 30)) default 2) (y default 3)