InfluxDB version used: 1.8.0
Given a time series db that is used for storing e.g. temperatures from iot sensors (on different locations).
The sensors are queried e.g. every other minute.
Now the maximum temperature per sensor for the last hour can be queried using
select max(*) from temperatures where time >= now() - 1h group by location
name: temperatures
tags: location=collector
time max_temperature
---- ---------------
2020-06-24T17:41:34Z 34.8
name: temperatures
tags: location=outside
time max_temperature
---- ---------------
2020-06-24T17:43:34Z 23.4
I'm now would like to keep the max temperatures for every hour and for every day for a certain period of time.
So naturally I would use a retention policy and continuous queries.
Lets say I want to store the the maximum temperature by the hour for a month:
show RETENTION POLICIES on iotsensors
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
lastmonth 744h0m0s 24h0m0s 1 false
The continuous query looks like this:
CREATE CONTINUOUS QUERY max_temperatures_per_hour ON iotsensors
BEGIN
SELECT max(temperature) INTO iotsensors.lastmonth.max_temperatures_per_hour FROM iotsensors.autogen.temperatures GROUP BY time(1h), location TZ('Europe/Berlin')
END
By the nature of the GROUP BY time(1h)
term, the exact time of the temperature is lost.
Especially when the data is condensed for a whole day in the second step FROM iotsensors.lastmonth.max_temperatures_per_hour GROUP BY time(1d)
the resolution is getting even more coarse. (setting it to midnight of each day 00:00:00
)
select max from iotmeasurements.last2years.max_temperatures_per_day where time >= now() - 4d group by location tz('Europe/Berlin')
name: max_temperatures_per_day
tags: location=collector
time max
---- ---
2020-06-21T00:00:00+02:00 80.9
2020-06-22T00:00:00+02:00 78.5
2020-06-23T00:00:00+02:00 101.2
name: min_max_temperatures_per_day
tags: location=outside
time max
---- ---
2020-06-21T00:00:00+02:00 21.8
2020-06-22T00:00:00+02:00 22.5
2020-06-23T00:00:00+02:00 22.8
I do know that this the expected and documented behaviour
https://docs.influxdata.com/influxdb/v1.8/query_language/explore-data/#group-by-time-intervals
However, the information of when exactly the maximum value was recorded is a valuable information which I'd like to keep.
Is there any way to store the exact timestamp of the record when downsampling?
I'd prefer to keep the timestamp inside the time field like
tags: location=collector
time max
---- ---
2020-06-20T04:30:40Z 80.9
2020-06-21T04:22:00Z 78.5
2020-06-22T04:53:10Z 101.2
Alternatively but a second best solution would be to add a timestamp field for each downsampled record
time max timestamp
---- --- ---------
2020-06-20T00:00:00+02:00 80.9 2020-06-20T04:30:40Z
2020-06-21T00:00:00+02:00 78.5 2020-06-21T04:22:00Z
2020-06-22T00:00:00+02:00 101.2 2020-06-22T04:53:10Z
For this I needed to be able to query the time into a separate field, wouldn't I.
But my attempts weren't successful so far. Something I tried was this:
SELECT max(temperature),time as timestamp FROM temperatures GROUP BY time(60m),"location"
I'd consider to move to InfluxDB 2.0 if that was a prerequesit for a solution to my problem.