I am trying to calculate a column that performs a running count of consecutive null values, but the running count will reset upon non-null values.
I am currently trying to achieve this on this version of redshift:
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.8187
I tried using this window function but this is just continuously incrementing the number for each null.
ROW_NUMBER() OVER (PARTITION BY ID, VAL ORDER BY VAL ROWS UNBOUNDED PRECEDING)
For example if I had a dataset like this:
id | date | val
----+-------+-------
1 | 1/1 | NULL
1 | 1/2 | NULL
1 | 1/3 | NULL
1 | 1/4 | 1
1 | 1/5 | NULL
1 | 1/6 | NULL
1 | 1/7 | 1
2 | 1/8 | 2
2 | 1/9 | NULL
2 | 1/1 | NULL
2 | 1/2 | 1
2 | 1/3 | NULL
2 | 1/4 | 0
2 | 1/5 | NULL
2 | 1/6 | NULL
I would like the output to look like this:
id | date | val | foo
----+-------+-------+-------
1 | 1/1 | NULL | 1
1 | 1/2 | NULL | 2
1 | 1/3 | NULL | 3
1 | 1/4 | 1 |
1 | 1/5 | NULL | 1
1 | 1/6 | NULL | 2
1 | 1/7 | 1 |
2 | 1/8 | 2 |
2 | 1/9 | NULL | 1
2 | 1/1 | NULL | 2
2 | 1/2 | 1 |
2 | 1/3 | NULL | 1
2 | 1/4 | 0 |
2 | 1/5 | NULL | 1
2 | 1/6 | NULL | 2
sum(case when val = 'NULL' then 1 else 0 end) over (partition by id,val order by id,val,date rows between unbounded preceding and current row) as foo
– demircioglu