I'm working on a IoT project. I've a Raspberry pi which sends smart meter data to an IoT eventhub on Azure. I read the data using an Azure Stream Analytics Job. Meter readings are sent every 10 seconds.
One of the queries I created calculates the electricity consumption per hour using a tumbling window. It calculates the max meterreading from this window minus the max meterreading of the previous window using lag. I am using the timestamp from the IOT device instead of the default arrival time.
WITH onehourwindow AS
(
SELECT
max(total_low) * 1000 as max_low,
max(total_high) * 1000 as max_high,
max(gas) * 1000 as max_gas,
round(avg(current_consumption), 1) as avg_consumption,
max(timestamp) as max_timestamp
FROM
eventhuninputsmartmeter TIMESTAMP BY timestamp
GROUP BY TUMBLINGWINDOW(hour, 1)
)
SELECT
(max_low - LAG(max_low) OVER (LIMIT DURATION(hour, 1))) / 1000 as total_consumption_low,
(max_high - LAG(max_high) OVER (LIMIT DURATION(hour, 1))) / 1000 as total_consumption_high,
(max_gas - LAG(max_gas) OVER (LIMIT DURATION(hour, 1))) / 1000 as total_consumption_gas,
avg_consumption,
max_timestamp
INTO
MeterReadingSQLDB
FROM
onehourwindow
The query returns the expected results in test. Here is an example of the timestamp and consumption in the test results. As expected the latest timestamp (max) is the last meter reading of the hour at 59 minutes and 50 or so seconds.
|----------------------------|---------------------------|
| max_timestamp | total_consumption_high |
|----------------------------|---------------------------|
|2020-02-28T22:59:52.1794730 | 1.171 |
|2020-02-28T21:59:51.6680430 | 0.500 |
|----------------------------|---------------------------|
When I run the query job I get the following results written to my SQL DB. Now the latest timestamp (max) is not the last meter reading of the (clock) hour, but instead it is at 54 minutes. If I do the consumption calculation by hand I can see that the window used is one hour, it just doesn't start at 00 but at 55 minutes every hour.
|----------------------------|---------------------------|
| max_timestamp | total_consumption_high |
|----------------------------|---------------------------|
|2020-02-28T22:54:52.1300000 | 1.353 |
|2020-02-28T21:54:51.6830000 | 0.510 |
|----------------------------|---------------------------|
How to solve this? I have tried a lot of things but can't seem to fix it. The answer to the post below looked promising, but my events don't come in late, definately not 6 minutes late. Thus still using the default event ordering policies.
Azure Stream Analytics 'TimeStamp By' in query doesn't works on job but works fine on test
Any thoughts on fixing this such that I get the max timestamp of the window at 59 minutes and 50 seconds or so?
Thanks!
Thomas
