
Rationale:

I'm looking at computing the covariance in Prometheus of two time series. The formula runs roughly as follows:

q[a,b] := sigma(i=1..n, (a[i]-avg(a)) * (b[i]-avg(b))) / (n-1)
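
To make the intended grouping explicit, here is a minimal Python sketch of that formula: each deviation from the mean is computed first, the products are summed, and the sum is divided by n - 1. The function name and the sample data are hypothetical and only there for illustration; this says nothing about how Prometheus itself would evaluate it.

```python
def sample_covariance(a, b):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    n = len(a)
    if n < 2:
        raise ValueError("need at least two samples")
    avg_a = sum(a) / n
    avg_b = sum(b) / n
    # Sum the products of each sample's deviation from its own series mean.
    total = sum((x - avg_a) * (y - avg_b) for x, y in zip(a, b))
    return total / (n - 1)


# Hypothetical sample data, not taken from any real metric.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]
print(sample_covariance(a, b))
```

The a[i] - avg(a) term corresponds to `x - avg_a` here: every individual sample keeps its identity and only the mean is shared, which is the behavior I am trying to reproduce in PromQL.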

Let's take a look at computing the time series of the a[i] - avg(a) term. I've selected rate(apiserver_request_total[5m]) as the worked example, grouping on the verb label (note that this is not a particularly useful time series; I simply wanted to use live data):

rate(apiserver_request_total[5m]) 
  - on()  group_left(verb) 
avg by(verb) (rate(apiserver_request_total[5m]))

This fails because, with an empty on(), the matching algorithm ends up pairing series whose verbs differ, such as LIST and PUT, and (properly) refuses to do so.

I can aggregate the left term to be:

sum(rate(apiserver_request_total[5m])) by (verb) 
 -  
avg by(verb) (rate(apiserver_request_total[5m]))

But this loses the detail of each server's behavior and, in any case, does not seem to conform to the meaning of the original equation.

What is the proper way to implement a[i] - avg(a) here?