5
votes

I am trying to do an alarm flood calculation in Snowflake. I created the dataset below using a Snowflake window function. If the value is greater than or equal to 3, an alarm flood starts, and it ends at the next 0 value. So in the example below, the first flood started at 9:51 and ended at 9:54, lasting 3 minutes. The next flood started at 9:57 and ended at 10:02, lasting 5 minutes. (The value at 9:59 is 3, but since a flood had already started, it is not counted as a new one.) The next flood starts at 10:03, but there is no 0 value after it, so we have to use the edge value at 10:06. So the total time in flood is 3 + 5 + 4 = 12 minutes.

   DateTime    Value
3/10/2020 9:50  1
3/10/2020 9:51  3
3/10/2020 9:52  1
3/10/2020 9:53  2
3/10/2020 9:54  0
3/10/2020 9:55  0
3/10/2020 9:56  1
3/10/2020 9:57  3
3/10/2020 9:58  2
3/10/2020 9:59  3
3/10/2020 10:00 2
3/10/2020 10:01 2
3/10/2020 10:02 0
3/10/2020 10:03 3
3/10/2020 10:04 1
3/10/2020 10:05 1
3/10/2020 10:06 1
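To make the rule concrete, the calculation I am describing works like this stateful scan (a plain-JavaScript sketch of the logic, not what I ran in Snowflake):

```javascript
// Flood rule: a flood opens at the first value >= 3 and closes at the
// next 0; a flood still open at the end of the data is counted through
// one minute past the last row (which is where the trailing 4 comes from).
// Times are one row per minute, so durations are just index differences.
function floodDurations(values) {
  const durations = [];
  let start = null; // index where the current flood opened
  values.forEach((v, i) => {
    if (v >= 3 && start === null) start = i; // flood opens
    if (v === 0 && start !== null) {         // flood closes at the 0
      durations.push(i - start);
      start = null;
    }
  });
  if (start !== null) durations.push(values.length - start); // edge case
  return durations;
}

// The Value column from the table above, 9:50 through 10:06:
const values = [1, 3, 1, 2, 0, 0, 1, 3, 2, 3, 2, 2, 0, 3, 1, 1, 1];
const durations = floodDurations(values);           // [3, 5, 4]
const total = durations.reduce((a, b) => a + b, 0); // 12
```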

So, in short, I am expecting the output below:

   DateTime        Value   diff_minutes
3/10/2020 9:51     3       3
3/10/2020 9:57     3       5
3/10/2020 10:03    3       4

I tried the SQL below, but it does not give the correct output; it fails on the second flood (because there is another value of 3 before the next 0):

select t.*,
       (case when value >= 3
             then datediff(minute,
                           datetime,
                           min(case when value = 0 then datetime end) over (order by datetime desc)
                          )
        end) as diff_minutes
from t;
What is it you are looking for exactly? Are you struggling with a SQL statement that will accomplish this? If so, what SQL have you tried? You might also want to add a generic SQL tag to this, since I don't think the solution will be Snowflake-specific. (Mike Walton)

3 Answers

2
votes

I'm not the proudest of this code, but it works and gives a starting place. I'm sure it can be cleaned up or simplified, and I haven't assessed its performance on larger tables.

The key insight I used is that if you add the diff back to the datetime, you can find rows where the results are equal, meaning that both rows are counting toward the same "0" record. Hopefully this concept is helpful if nothing else.

Also, the first CTE is a slightly hacky way to get the 4 at the end of your results.
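That collision can be seen in a tiny plain-JavaScript sketch (illustrative minute values, not part of the query):

```javascript
// Two high values at minutes 7 and 9 both "count to" the 0 at minute 12.
const highs = [7, 9];    // times of rows with value >= 3
const nextZero = 12;     // time of the next 0 row, shared by both
const diffs = highs.map(t => nextZero - t);         // [5, 3]
// Adding each diff back to its own time collides on the shared 0 record:
const resolved = highs.map((t, i) => t + diffs[i]); // [12, 12]
// Keep only the first high of each collision group, as the lag() check does:
const kept = diffs.filter((d, i) => i === 0 || resolved[i] !== resolved[i - 1]);
// kept === [5]: only the flood's opening row reports a duration
```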

--Add a fake zero at the end of the table to provide a value for
-- comparing high values that have not been resolved
-- added a flag so this fake value can be removed later
with fakezero as
(
SELECT datetime, value, 1 flag
FROM test

UNION ALL

SELECT dateadd(minute, 1, max(datetime)) datetime, 0 value, 0 flag
FROM test  
)

-- Find date diffs between high values and subsequent low values
,diffs as (
select t.*,
       (case when value >= 3
             then datediff(minute,
                           datetime,
                           min(case when value = 0 then datetime end) over (order by datetime desc)
                          )
        end) as diff_minutes
from fakezero t
)

--Fix cases where two high values are "resolved" by the same low value,
-- i.e. where adding diff_minutes to the datetime yields the same timestamp.
-- When that happens, the later high value belongs to a flood that has
-- already started, so its diff is nulled out.
select
  datetime
  ,value
  ,case when 
      lag(dateadd(minute, diff_minutes, datetime)) over(partition by value order by datetime)
      = dateadd(minute, diff_minutes, datetime)
    then null 
    else diff_minutes 
  end as diff_minutes
from diffs
where flag = 1
order by datetime;
1
vote
WITH data as (
  select time::timestamp as time, value from values
    ('2020-03-10 9:50', 1 ),
    ('2020-03-10 9:51', 3 ),
    ('2020-03-10 9:52', 1 ),
    ('2020-03-10 9:53', 2 ),
    ('2020-03-10 9:54', 0 ),
    ('2020-03-10 9:55', 0 ),
    ('2020-03-10 9:56', 1 ),
    ('2020-03-10 9:57', 3 ),
    ('2020-03-10 9:58', 2 ),
    ('2020-03-10 9:59', 3 ),
    ('2020-03-10 10:00', 2 ),
    ('2020-03-10 10:01', 2 ),
    ('2020-03-10 10:02', 0 ),
    ('2020-03-10 10:03', 3 ),
    ('2020-03-10 10:04', 1 ),
    ('2020-03-10 10:05', 1 ),
    ('2020-03-10 10:06', 1 )
     s( time, value)
) 
select 
    a.time
    ,a.value
    ,min(trig_time)over(partition by reset_time_group order by time) as first_trigger_time
    ,iff(a.time=first_trigger_time, datediff('minute', first_trigger_time, reset_time_group), null) as trig_duration
from (
select d.time
   ,d.value 
   ,iff(d.value>=3,d.time,null) as trig_time
   ,iff(d.value=0,d.time,null) as reset_time
   ,max(time)over(order by time ROWS BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING) as max_time
   ,coalesce(lead(reset_time)ignore nulls over(order by d.time), max_time) as lead_reset_time
   ,coalesce(reset_time,lead_reset_time) as reset_time_group
from data as d
) as a
order by time;

This gives the results you seem to expect/describe:

TIME                     VALUE  FIRST_TRIGGER_TIME         TRIG_DURATION
2020-03-10 09:50:00.000    1        
2020-03-10 09:51:00.000    3    2020-03-10 09:51:00.000    3
2020-03-10 09:52:00.000    1    2020-03-10 09:51:00.000    
2020-03-10 09:53:00.000    2    2020-03-10 09:51:00.000    
2020-03-10 09:54:00.000    0    2020-03-10 09:51:00.000    
2020-03-10 09:55:00.000    0        
2020-03-10 09:56:00.000    1        
2020-03-10 09:57:00.000    3    2020-03-10 09:57:00.000    5
2020-03-10 09:58:00.000    2    2020-03-10 09:57:00.000    
2020-03-10 09:59:00.000    3    2020-03-10 09:57:00.000    
2020-03-10 10:00:00.000    2    2020-03-10 09:57:00.000    
2020-03-10 10:01:00.000    2    2020-03-10 09:57:00.000    
2020-03-10 10:02:00.000    0    2020-03-10 09:57:00.000    
2020-03-10 10:03:00.000    3    2020-03-10 10:03:00.000    3
2020-03-10 10:04:00.000    1    2020-03-10 10:03:00.000    
2020-03-10 10:05:00.000    1    2020-03-10 10:03:00.000    
2020-03-10 10:06:00.000    1    2020-03-10 10:03:00.000    

So, how it works: we find the trigger times and the reset times, then work out max_time to handle the last-row edge case. After that we find the next reset_time looking forward, falling back to max_time when there is none, and then select the current row's reset time or the looked-ahead lead_reset_time. For the work you are doing here that last step could be skipped, since your data cannot trigger and reset on the same row; and because the math is done on the trigger row, it doesn't matter which group the reset row belongs to.

Then we break into a new SELECT layer, as we have reached Snowflake's limit on nested/interrelated expressions, and take the MIN across the reset_time_group to find the first trigger time, which we then compare to the row's time and run a DATEDIFF on.
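That grouping step can be mimicked row by row in plain JavaScript (an illustrative sketch of the logic, with times as minutes since 9:50; it reproduces the 3/5/3 durations in the output above, before the one-minute adjustment):

```javascript
// rows: [time, value], one row per minute starting at 9:50.
const rows = [[0,1],[1,3],[2,1],[3,2],[4,0],[5,0],[6,1],[7,3],[8,2],[9,3],
              [10,2],[11,2],[12,0],[13,3],[14,1],[15,1],[16,1]];

// lead(reset_time) ignore nulls: the next 0 strictly after each row,
// falling back to max_time (the last row's time) when none exists;
// a 0 row keeps its own time, as in coalesce(reset_time, lead_reset_time).
const maxTime = rows[rows.length - 1][0];
const resetGroup = rows.map(([t, v], i) => {
  if (v === 0) return t;
  const next = rows.slice(i + 1).find(([, nv]) => nv === 0);
  return next ? next[0] : maxTime;
});

// min(trig_time) over (partition by reset_time_group): the first trigger
// in each group gets the duration, later triggers in the group get nothing.
const durations = {};
rows.forEach(([t, v], i) => {
  const g = resetGroup[i];
  if (v >= 3 && !(g in durations)) durations[g] = g - t;
});
// durations: { 4: 3, 12: 5, 16: 3 }
```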

As a side note, DATEDIFF is a little naive in its math: '2020-01-01 23:59:59' and '2020-01-02 00:00:01' are 2 seconds apart, but DATEDIFF reports them as 1 minute apart, 1 hour apart, and 1 day apart, because the function truncates both timestamps to the selected unit and then differences those results.
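The truncate-then-difference behavior can be imitated directly (a plain-JavaScript sketch of the minute-level case only):

```javascript
// Imitate DATEDIFF('minute', a, b): truncate both timestamps to whole
// minutes since the epoch, then subtract the truncated values.
const a = new Date('2020-01-01T23:59:59Z');
const b = new Date('2020-01-02T00:00:01Z');

const truncMinutes = (d) => Math.floor(d.getTime() / 60000);
const naiveMinuteDiff = truncMinutes(b) - truncMinutes(a); // 1, like DATEDIFF
const trueSecondsApart = (b - a) / 1000;                   // 2 seconds apart
```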

To get the final flood to have the value 4, as asked for in the question, alter the lead_reset_time line to:

,coalesce(lead(reset_time)ignore nulls over(order by d.time), dateadd('minute', 1, max_time)) as lead_reset_time

This moves max_time forward by one minute, assuming (in the absence of future data) that the existing row state at 10:06 remains valid for one more minute. That is not how I would do it, but it is the code you asked for.

1
vote

The JavaScript UDF (table function) version:

select d, v, iff(3<=v and 1=row_number() over (partition by N order by d),
    count(*) over (partition by N), null) trig_duration
from t, lateral flood_count(t.v::float) 
order by d;

Where flood_count() is defined as:

create or replace function flood_count(V float) 
returns table (N float)
language javascript AS
$${

  initialize: function() { 
    this.n = 0 
    this.flood = false
  },

  processRow: function(row, rowWriter) { 
    if (3<=row.V && !this.flood) {
        this.flood = true
        this.n++
    }
    else if (0==row.V) this.flood=false
    rowWriter.writeRow({ N: this.flood ? this.n : null })  
  },

}$$;

Assuming this input:

create or replace table t as
select to_timestamp(d, 'mm/dd/yyyy hh:mi') d, v 
from values
    ('3/10/2020 9:50',  1),
    ('3/10/2020 9:51',  3),
    ('3/10/2020 9:52',  1),
    ('3/10/2020 9:53',  2),
    ('3/10/2020 9:54',  0),
    ('3/10/2020 9:55',  0),
    ('3/10/2020 9:56',  1),
    ('3/10/2020 9:57',  3),
    ('3/10/2020 9:58',  2),
    ('3/10/2020 9:59',  3),
    ('3/10/2020 10:00', 2),
    ('3/10/2020 10:01', 2),
    ('3/10/2020 10:02', 0),
    ('3/10/2020 10:03', 3),
    ('3/10/2020 10:04', 1),
    ('3/10/2020 10:05', 1),
    ('3/10/2020 10:06', 1)
    t(d,v)
;
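The UDF's row-by-row state machine can be checked outside Snowflake with the same logic in plain JavaScript (a sketch; N labels each flood, null marks rows outside a flood, and counting rows per N reproduces the 3/5/4 durations):

```javascript
// Replicates processRow(): open a flood at value >= 3, close it at 0,
// and label every in-flood row with the flood number N.
function floodLabels(values) {
  let n = 0, flood = false;
  return values.map(v => {
    if (v >= 3 && !flood) { flood = true; n++; }
    else if (v === 0) flood = false;
    return flood ? n : null;
  });
}

// The v column from the table above, 9:50 through 10:06:
const labels = floodLabels([1,3,1,2,0,0,1,3,2,3,2,2,0,3,1,1,1]);

// count(*) over (partition by N), done by hand:
const counts = {};
labels.forEach(l => { if (l !== null) counts[l] = (counts[l] || 0) + 1; });
// counts: { 1: 3, 2: 5, 3: 4 }
```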