
(See updated, clearer explanation below: go to "Update on 2019-09-20")

I am looking for a way to have Grafana query my Prometheus data source using label values that are YYYY-MM-DD dates computed relative to the current day.

To see the last 4 days, I could create a Grafana graph with 4 queries using hard-coded labels, as follows. It would work, but I would need to update the graph every day:

  • myapp_metric_foo{task_date="2019-09-16"}
  • myapp_metric_foo{task_date="2019-09-17"}
  • myapp_metric_foo{task_date="2019-09-18"}
  • myapp_metric_foo{task_date="2019-09-19"}

To avoid that, I am looking for some date computation formula like: now - 1 day | format_date "YYYY-MM-DD"

So my Grafana graph queries example would be:

  • myapp_metric_foo{task_date="{{ now | format_date "YYYY-MM-DD" }}"}
  • myapp_metric_foo{task_date="{{ now - 1 day | format_date "YYYY-MM-DD" }}"}
  • myapp_metric_foo{task_date="{{ now - 2 day | format_date "YYYY-MM-DD" }}"}
  • myapp_metric_foo{task_date="{{ now - 3 day | format_date "YYYY-MM-DD" }}"}
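For illustration only, the date arithmetic wished for above can be sketched outside Grafana in Python. This is a minimal sketch, not a Grafana feature: the helper names `last_task_dates` and `build_queries` are hypothetical, and "today" is passed in explicitly so the result is reproducible.

```python
from datetime import date, timedelta
from typing import List


def last_task_dates(today: date, days: int = 4) -> List[str]:
    """Return the last `days` dates ending at `today`, oldest first, as YYYY-MM-DD."""
    return [
        (today - timedelta(days=offset)).isoformat()
        for offset in range(days - 1, -1, -1)
    ]


def build_queries(today: date, days: int = 4, metric: str = "myapp_metric_foo") -> List[str]:
    """Build one PromQL selector per day, each with a hard-coded task_date label value."""
    return [f'{metric}{{task_date="{d}"}}' for d in last_task_dates(today, days)]


if __name__ == "__main__":
    # With 2019-09-19 as "today", this prints the four hand-written queries listed earlier.
    for query in build_queries(date(2019, 9, 19)):
        print(query)
```

Something still has to run this every day and push the result into the graph, which is the part Grafana does not seem to offer natively.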

I couldn't find anything in Grafana that seems to allow such a thing.

Another idea would be to update the existing graph from an external script via the Grafana API...


Update on 2019-09-20:

It looks like I need to give more explanation about the application.

Application context

The instrumented application myapp runs tasks/jobs (let's say they are calculation jobs that can take some time). Each task has a task_date (i.e. the day it was submitted; it is set at task creation and never changes) and can be in one of the following 3 states:

  • new
  • running
  • done

When Prometheus scrapes myapp, myapp tells Prometheus how many tasks:

  • are in state new, grouped by task_date
  • are in state running, grouped by task_date
  • are in state done, grouped by task_date

The application deletes done tasks older than 7 days.

Application data

Let's say the application has, at 2019-09-19 14h00, the following tasks in its database:

+----+------------+---------+---+
| ID | task_date  | status  | … |
+----+------------+---------+---+
| 42 | 2019-09-12 | done    | … |
| 43 | 2019-09-12 | done    | … |
| 44 | 2019-09-12 | done    | … |
| 45 | 2019-09-13 | done    | … |
| 46 | 2019-09-15 | done    | … |
| 47 | 2019-09-15 | done    | … |
| 48 | 2019-09-16 | done    | … |
| 49 | 2019-09-17 | running | … |
| 50 | 2019-09-17 | done    | … |
| 51 | 2019-09-17 | done    | … |
| 52 | 2019-09-18 | new     | … |
| 53 | 2019-09-18 | running | … |
| 54 | 2019-09-18 | running | … |
| 55 | 2019-09-18 | done    | … |
| 56 | 2019-09-18 | done    | … |
| 57 | 2019-09-19 | new     | … |
| 58 | 2019-09-19 | new     | … |
| 59 | 2019-09-19 | running | … |
+----+------------+---------+---+

The metrics exposed to Prometheus by myapp, at 2019-09-19 18h00 would be (text-based format):

myapp_tasks_total{task_date="2019-09-12",status="done"} 3
myapp_tasks_total{task_date="2019-09-13",status="done"} 1
myapp_tasks_total{task_date="2019-09-15",status="done"} 2
myapp_tasks_total{task_date="2019-09-16",status="done"} 1
myapp_tasks_total{task_date="2019-09-17",status="running"} 1
myapp_tasks_total{task_date="2019-09-17",status="done"} 2
myapp_tasks_total{task_date="2019-09-18",status="new"} 1
myapp_tasks_total{task_date="2019-09-18",status="running"} 2
myapp_tasks_total{task_date="2019-09-18",status="done"} 2
myapp_tasks_total{task_date="2019-09-19",status="new"} 2
myapp_tasks_total{task_date="2019-09-19",status="running"} 1
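As a rough illustration of the grouping behind the output above, here is a minimal Python sketch that counts tasks per (task_date, status) and renders them in the text-based format. The `TASKS` list is a hand-copied subset of the table (the 2019-09-18 and 2019-09-19 rows), and `render_metrics` is a hypothetical helper, not the application's actual exporter.

```python
from collections import Counter

# Hypothetical in-memory rows as (task_date, status), copied from tasks 52 to 59 above.
TASKS = [
    ("2019-09-18", "new"),
    ("2019-09-18", "running"),
    ("2019-09-18", "running"),
    ("2019-09-18", "done"),
    ("2019-09-18", "done"),
    ("2019-09-19", "new"),
    ("2019-09-19", "new"),
    ("2019-09-19", "running"),
]

# Display order of statuses within one task_date.
STATUS_ORDER = {"new": 0, "running": 1, "done": 2}


def render_metrics(tasks, name="myapp_tasks_total"):
    """Count tasks per (task_date, status) and render one exposition line per pair."""
    counts = Counter(tasks)
    keys = sorted(counts, key=lambda key: (key[0], STATUS_ORDER[key[1]]))
    return "\n".join(
        f'{name}{{task_date="{day}",status="{status}"}} {counts[(day, status)]}'
        for day, status in keys
    )


if __name__ == "__main__":
    print(render_metrics(TASKS))
```

A (task_date, status) pair with no tasks simply produces no line, which is why the number of series per day varies.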

Let's suppose the following occurs on the application afterwards:

  • A task submitted at day 2019-09-18 starts (moves from new to running)
  • A task submitted at day 2019-09-19 finishes (moves from running to done)
  • Tasks having a task_date older than 7 days are deleted (here tasks for 2019-09-12)
  • A new task is submitted at 2019-09-20 00h43m

A few hours later, at 2019-09-20 02h00, the new exposed metrics output would be:

myapp_tasks_total{task_date="2019-09-13",status="done"} 1
myapp_tasks_total{task_date="2019-09-15",status="done"} 2
myapp_tasks_total{task_date="2019-09-16",status="done"} 1
myapp_tasks_total{task_date="2019-09-17",status="running"} 1
myapp_tasks_total{task_date="2019-09-17",status="done"} 2
myapp_tasks_total{task_date="2019-09-18",status="running"} 3
myapp_tasks_total{task_date="2019-09-18",status="done"} 2
myapp_tasks_total{task_date="2019-09-19",status="new"} 2
myapp_tasks_total{task_date="2019-09-19",status="done"} 1
myapp_tasks_total{task_date="2019-09-20",status="done"} 1

My Grafana graph (visualization type=Graph) would use the following 4 PromQL queries (4, because I only want to see the last 4 days):

  • Query A
    • Metrics: myapp_tasks_total{task_date="2019-09-17"}
    • Legend: {{status}} tasks submitted 3 days ago
  • Query B
    • Metrics: myapp_tasks_total{task_date="2019-09-18"}
    • Legend: {{status}} tasks submitted 2 days ago
  • Query C
    • Metrics: myapp_tasks_total{task_date="2019-09-19"}
    • Legend: {{status}} tasks submitted yesterday
  • Query D
    • Metrics: myapp_tasks_total{task_date="2019-09-20"}
    • Legend: {{status}} tasks submitted today

This would produce at most 4*3=12 curves (depending on how many distinct statuses exist for each day), which would help me keep track of the application load (number of tasks) and speed (time-to-done).

The question

The Prometheus instrumenting part is not a problem for me: I know how to get my data from my database and how to expose it to Prometheus.

My issue is with the PromQL queries Grafana needs: the 4 queries I gave above are only relevant when accessing Grafana on 2019-09-20. I need a way to dynamically change the task_date= criterion in each query.

I was hoping Grafana had a custom DSL that would allow me to tell it:

Hey, take your $__to variable, subtract x days and format it as "YYYY-MM-DD".

Something like: {{ $__to - x * 86400000 | format_date "YYYY-MM-DD" }}

(1 day = 86400000 ms)

Another idea would be to manually create the graph and periodically update it from an external script via the Grafana API...
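A minimal sketch of what such an external script might compute, in Python. It applies the "subtract x * 86400000 ms and format" arithmetic to a `to_ms` timestamp and builds the four query definitions (Query A to D) as dictionaries with `refId`, `expr` and `legendFormat` keys. The function names are hypothetical, dates are formatted in UTC (an assumption; the application may use another timezone for task_date), and the step that writes these targets into the panel and saves the dashboard through the Grafana HTTP API is deliberately not shown.

```python
import json
from datetime import datetime, timezone

DAY_MS = 86_400_000  # 1 day in milliseconds


def task_date(to_ms: int, days_back: int) -> str:
    """Format (to_ms - days_back * DAY_MS) as YYYY-MM-DD, assuming UTC."""
    seconds = (to_ms - days_back * DAY_MS) / 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")


def legend(days_back: int) -> str:
    """Human-readable suffix matching the legends used in the question."""
    if days_back == 0:
        return "today"
    if days_back == 1:
        return "yesterday"
    return f"{days_back} days ago"


def build_targets(to_ms: int, days: int = 4):
    """Build one query definition per day, oldest first (Query A is the oldest)."""
    targets = []
    for index, days_back in enumerate(range(days - 1, -1, -1)):
        targets.append({
            "refId": chr(ord("A") + index),
            "expr": f'myapp_tasks_total{{task_date="{task_date(to_ms, days_back)}"}}',
            "legendFormat": "{{status}} tasks submitted " + legend(days_back),
        })
    return targets


if __name__ == "__main__":
    # 2019-09-20 02h00 UTC, expressed in epoch milliseconds like $__to.
    to_ms = int(datetime(2019, 9, 20, 2, 0, tzinfo=timezone.utc).timestamp() * 1000)
    print(json.dumps(build_targets(to_ms), indent=2))
```

Run from a daily cron job, this would regenerate Query A to D with the current dates, at the cost of rewriting the dashboard every day.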

Based on your update, there is not enough information to do what you want. There is a similar use case in k8s, where cron-triggered tasks carry labels specific to the start time of the task. Based on that experience, I would say you are missing a gauge metric giving you the start time (in epoch) of the task. - Michael Doubez
@CDuv did you figure this out? - fmakawa
No, I did not :-( - CDuv

1 Answer

Since they have different label values (`task_date`), they are considered different time series. You have to remove the dimension (the label).

Remove the label at request time

Replace the label with an empty value, which removes it. This assumes the result doesn't contain duplicate series.

label_replace(myapp_metric_foo, "task_date", "", "task_date", ".*")

Or aggregate the metrics:

max(myapp_metric_foo) WITHOUT(task_date)
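To make the effect of the aggregation concrete, here is a small Python simulation of `max ... without (task_date)` applied to the sample values from the question's first scrape. It only mimics the grouping logic on a plain dictionary and is not how Prometheus evaluates the query; `max_without_task_date` is a hypothetical helper, and the samples use `myapp_tasks_total` from the question's update rather than `myapp_metric_foo`.

```python
from collections import defaultdict

# (task_date, status) -> value, copied from the question's 2019-09-19 scrape.
SAMPLES = {
    ("2019-09-12", "done"): 3,
    ("2019-09-13", "done"): 1,
    ("2019-09-15", "done"): 2,
    ("2019-09-16", "done"): 1,
    ("2019-09-17", "running"): 1,
    ("2019-09-17", "done"): 2,
    ("2019-09-18", "new"): 1,
    ("2019-09-18", "running"): 2,
    ("2019-09-18", "done"): 2,
    ("2019-09-19", "new"): 2,
    ("2019-09-19", "running"): 1,
}


def max_without_task_date(samples):
    """Group series by every label except task_date (here: status) and keep the max."""
    grouped = defaultdict(list)
    for (_task_date, status), value in samples.items():
        grouped[status].append(value)
    return {status: max(values) for status, values in grouped.items()}


if __name__ == "__main__":
    print(max_without_task_date(SAMPLES))
    # → {'done': 3, 'running': 2, 'new': 2}
```

One series per status remains, and the per-date breakdown is no longer visible in the result.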

Remove the label at ingestion time

In your Prometheus configuration, you can use metric relabeling to drop the label:

metric_relabel_configs:
# Drop the task_date label from every scraped series
- regex: 'task_date'
  action: labeldrop

In my opinion, you had better drop it at ingestion time. Having a different label value per day doesn't really make sense, unless it is some kind of very long scheduled job.