0
votes

Following is my Raw data

DATE        Product_id Customer_ID
----------------------------
01-JUL-14       60      A
01-AUG-14       45      A
01-SEP-14       45      A
01-SEP-14       50      A
01-OCT-14       30      A
01-JUL-14       60      B
01-AUG-14       45      B
01-SEP-14       45      B
01-OCT-14       30      B

This is the desired result for my rolling counts

MMYY     Distinct   Customer
         Product
-------------------------
JUL-14      1       A
AUG-14      2       A
SEP-14      3       A
OCT-14      3       A
JUL-14      1       B
AUG-14      2       B
SEP-14      2       B
OCT-14      2       B

The way I need it to work is that for each MMYY, I need to look back 3 months, and COUNT DISTINCT Products per CUSTOMER Products can be repeat. And also a customer can have more than 1 product in same month.

Normally I would write the query like so

SELECT
  customer_ID, 
  T.Date as MMYY,
  COUNT(DISTINCT Product_id)
    OVER (PARTITION BY customer_ID ORDER BY T.Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
    AS Last_3_month_count
    FROM T

But the problem is that we cannot use COUNT(DISTINCT Product_id) its gives an error. Suggest me other way to do solve this problem.

2

2 Answers

0
votes

Update Answer

You can run a query as follows

    select to_char(b.date,'MM-YY') as mm_yy
          ,count(distinct b.product_id)
          ,b.customer_id
      from raw_data a
      join raw_data b
        on a.customer_id=b.customer_id
       and b.date>=add_months(a.date,-2)
       and b.date<=a.date
  group by to_char(b.date,'MM-YY')
          ,b.customer_id

If vertica allows select in a select then this can be made use of

select to_char(b.date,'MM-YY') as mm_yy
      ,(select count(distinct product_id)
          from raw_data a
         where a.customer_id=b.customer_id
           and a.date>=add_months(b.date,-3)
           and a.date<b.date) as cnt_distinct_pid
       ,b.customer
  from raw_data b
0
votes

I find my solution This gives me exactly same result set as i wanted.

select to_char(a.date,'MM-YY') as mm_yy
          ,count(distinct b.product_id)
          ,b.customer_id
      from raw_data a
      join raw_data b
        on a.customer_id=b.customer_id
       and b.date between(add_months(a.date,-2) and a.date)
  group by to_char(a.date,'MM-YY')
          ,b.customer_id