3
votes

Without resorting to CTEs or a sub-query is there any way to use Window functionality with a different summary level than the GROUP BY? COUNT(*) works, but if a column name is specified in the COUNT or the SUM function is used, the query errors with "is not a valid group by expression". Even if the PARTITION BY columns are the same as the GROUP BY the error results.

The commented out lines will cause the query to fail. It's precisely for these types of things that one would want to use Window functionality in the first place.

create table sales (product_id integer, retail_price real, quantity integer, city varchar, state varchar);
insert into sales (product_id, retail_price, quantity, city, state) values 
(1, 2.00,  1, 'SF', 'CA'),
(1, 2.00,  2, 'SJ', 'CA'),
(2, 5.00,  4, 'SF', 'CA'),
(2, 5.00,  8, 'SJ', 'CA'),
(2, 5.00, 16, 'Miami', 'FL'),
(2, 5.00, 32, 'Orlando', 'FL'),
(2, 5.00, 64, 'SJ', 'PR');

select  city, state
,   count(*) as city_sale_cnt
,   count(*) over ( partition by state) as state_sale_cnt
--  ,   count(product_id) over ( partition by state) as state_sale_cnt2
,   sum(retail_price) as city_price
--  ,   sum(retail_price) over ( partition by state) as state_price

from sales
group by 1,2;

The docs indicate Window functionality might cause problems, including the vague warning "PARTITION BY is not always compatible with GROUP BY.": The error message SQL compilation error: ... is not a valid group by expression is often a sign that different columns in the SELECT statement’s “project” clauses are not partitioned the same way and therefore might produce different numbers of rows.

2

2 Answers

8
votes

The commented out code is not correct. The reason is that the window function is parsed "after" the group by, and there is no product_id or retail_price after the group by.

This is easily fixed:

select city, state,
       count(*) as city_sale_cnt,
       count(*) over (partition by state) as state_sale_cnt,
       sum(count(product_id)) over (partition by state) as ,
       sum(retail_price) as city_price,
       sum(sum(retail_price)) over ( partition by state) as state_price
from sales
group by 1, 2;

At first, using window functions in an aggregation query looks a bit confusing -- the nested aggregation functions look awkward. I find, though that it is pretty easy to get used to the syntax, once you have used it a few times.

3
votes

Although snowflake would possibly allow that (as demonstrated by Gordon Linoff), I would advocate for wrapping the aggregate query and using window functions in the outer query.

Few RDBMS allow mixing window functions and aggregation, and the resulting queries are usally hard to understand (unless you are an authentic SQL wizard like Gordon!).

select
    t.*,
    sum(city_sale_cnt) over (partition by state) as state_sale_cnt,
    sum(city_price) over ( partition by state) as state_price
from (
    select  
        city, 
        state,
        count(*) as city_sale_cnt,
        sum(retail_price) as city_price
    from sales
    group by 1,2
) t;