0
votes

I would like to write sas code (proc-sql). The question is how to create variable VP_Trans (in the table below). The table below is referring to the customers transactions within some period. I would like to calculate how many VP transactions the customer had based on his last 5 transactions. 5th transaction included e.g. Customer 1 had 5 VP transactions calculating from transactions happened between 1.2.-11.2.

+---------+-------------+-----------+----------+
| cust id | Trans Dates | TransType | Vp_Trans |
+---------+-------------+-----------+----------+
|       1 | 2015-02-01  | VP        | NA       |
|       1 | 2015-02-03  | Vp        | NA       |
|       1 | 2015-02-06  | VP        | NA       |
|       1 | 2015-02-06  | Vp        | NA       |
|       1 | 2015-02-11  | Vp        | 5        |
|       1 | 2015-02-13  | MG        | 4        |
|       1 | 2015-02-17  | Vp        | 4        |
|       1 | 2015-02-18  | mg        | 3        |
|       1 | 2015-02-19  | mg        | 2        |
|       1 | 2015-02-20  | mg        | 1        |
|       1 | 2015-02-21  | vp        | 2        |
|       2 | 2015-02-01  | VP        | NA       |
|       2 | 2015-02-03  | mg        | NA       |
|       2 | 2015-02-06  | mg        | NA       |
|       2 | 2015-02-06  | Vp        | NA       |
|       2 | 2015-02-11  | Vp        | 3        |
|       2 | 2015-02-13  | MG        | 2        |
|       2 | 2015-02-17  | Vp        | 3        |
|       2 | 2015-02-18  | mg        | 3        |
|       2 | 2015-02-19  | mg        | 2        |
|       2 | 2015-02-20  | mg        | 1        |
|       2 | 2015-02-21  | mg        | 1        |
|       2 | 2015-02-22  | mg        | 0        |
+---------+-------------+-----------+----------+
2
Getting the last five transactions is painful using SAS proc sql. Either use a data step or use a native query if you are using another database.Gordon Linoff
Data step would be great.Vendula

2 Answers

0
votes

Use a 5 element array to keep track of the checks for type='VP'. Use the MOD() function to implement wrap around indexing. Here is your example data:

data have ;
 input id Date Type $ Expected ;
 informat date yymmdd10.;
 format date yymmdd10.;
cards;
 1 2015-02-01 VP .
 1 2015-02-03 Vp .
 1 2015-02-06 VP .
 1 2015-02-06 Vp .
 1 2015-02-11 Vp 5
 1 2015-02-13 MG 4
 1 2015-02-17 Vp 4
 1 2015-02-18 mg 3
 1 2015-02-19 mg 2
 1 2015-02-20 mg 1
 1 2015-02-21 vp 2
 2 2015-02-01 VP .
 2 2015-02-03 mg .
 2 2015-02-06 mg .
 2 2015-02-06 Vp .
 2 2015-02-11 Vp 3
 2 2015-02-13 MG 2
 2 2015-02-17 Vp 3
 2 2015-02-18 mg 3
 2 2015-02-19 mg 2
 2 2015-02-20 mg 1
 2 2015-02-21 mg 1
 2 2015-02-22 mg 0
;;;;

Here is the data step to calculate VP_TRANS.

data want ;
   do n=1 by 1 until (last.id);
      set have ;
      by id date ;
      array flags (5) ;
      flags(mod(n,5)+1)= upcase(type)='VP';
      if n>= 5 then VP_trans=sum(of flags(*));
      output;
  end;
  drop flags: ;
run;
proc print; run;

enter image description here

0
votes

If you DBMS supports Windowed Aggregates, it's a simple Moving Count:

select
   cust_id, Trans_Dates, TransType,
   count(case when TransType = 'VP' then 1 end)
   over (partition bycust_id
         order by Trans_Dates
         rows 4 preceding) as Vp_Trans
from tab