I have a common table expression with a window function and keep getting an error message:
Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 82:6 Invalid column reference 'gcr_amt' in definition of CTE pro_orders [ select o.shopper_id as pro_shopper_id, date_format(o.order_date, 'YYYYMM') as ym_order, sum(o.gcr_amt) as total_gcr, sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt end) as new_gcr, sum(o.gcr_amt) over (partition by o.shopper_id rows between 12 preceding and 1 following) as 12months_direct_gcr from dp_enterprise.uds_order o inner join combined_shopper_level_data cs on cs.pro_shopper_id = o.shopper_id and cs.year_month = date_format(o.order_date, 'YYYYMM') where o.exclude_reason_desc is Null group by o.shopper_id, o.order_date ] used as po at Line 83:5
My cte looks like this:
pro_orders as (
select o.shopper_id as pro_shopper_id,
date_format(o.order_date, 'YYYYMM') as ym_order,
sum(o.gcr_amt) as total_gcr,
sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt end) as new_gcr,
sum(o.gcr_amt) over (partition by o.shopper_id, cs.year_month order by cs.year_month desc rows between 12 preceding and 0 following) as 12months_direct_gcr
from dp_enterprise.uds_order o
right join combined_shopper_level_data cs on cs.pro_shopper_id = o.shopper_id and cs.year_month = date_format(o.order_date, 'YYYYMM')
group by o.shopper_id, o.order_date
),
I don't use window functions that often and perhaps my syntax is off. In english what I'm trying to do is get the 12 months total for metric "gcr".
So a row with shopper_id 123abc in year_month 201901, I'd like to sum the preceding 11 months plus the current rows month of gcr for a total of 12 months. Not sure if my window function is correctly set up for that or not?
The format of year_month being referenced is YYYYMM e.g. 201901.
Is my window function set up right given my goal?
How can I overcome this error message?
Edit: Still receiving this error message with the following CTE:
pro_orders as (
select o.shopper_id as pro_shopper_id,
cs.year_month,
sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month then o.gcr_amt else 0 end) as total_gcr,
sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month and o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr,
sum(sum(o.gcr_amt)) over (partition by o.shopper_id
order by cs.year_month desc
rows between 12 preceding and 0 following)
as 12months_direct_gcr
from combined_shopper_level_data cs
left join dp_enterprise.uds_order o on o.shopper_id = cs.pro_shopper_id
where o.exclude_reason_desc is Null
group by o.shopper_id, cs.year_month
),
Results in similar error message:
Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 83:10 Invalid column reference 'gcr_amt' in definition of CTE pro_orders [ select o.shopper_id as pro_shopper_id, cs.year_month, sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month then o.gcr_amt else 0 end) as total_gcr, sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month and o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr, sum(sum(o.gcr_amt)) over (partition by o.shopper_id order by cs.year_month desc rows between 12 preceding and 0 following) as 12months_direct_gcr from combined_shopper_level_data cs left join dp_enterprise.uds_order o on o.shopper_id = cs.pro_shopper_id where o.exclude_reason_desc is Null group by o.shopper_id, cs.year_month ] used as po at Line 87:5