0
votes

I am currently practicing SAS programming on using two SAS dataset(sample and master) . Below are the hypothetical or dummy data created for illustration purpose to solve my problem through SAS programming . I would like to extract the data for the id's in sample dataset from master dataset(test). I have given an example with few id's as sample dataset, for which i need to extract next 12 month information from master table(test) for each id's based on the yearmonth information( desired output given in the third output).

Below is the code to extract the previous 12 month data but i am not getting idea to extract next 12 month records as pulled for previous months, Can anyone help me in solving this problem using SAS programming with optimized way.

proc sort data=test;
by id yearmonth;
run;

data result;
set test;
array prev_month {13} PREV_MONTH_0-PREV_MONTH_12;
by id;
if first.id then do;
do i =1 to 13;
prev_month(i)=0;
end;
end;
do i = 13 to 2 by -1;
prev_month(i)=prev_month(i-1);
end;
prev_month(1)=no_of_cust;
drop i prev_month_0;
retain prev_month:;
run;

data sample1;
set sample(drop=no_of_cust);
run;

proc sort data=sample1;
by id yearmonth;
run;

data all;
merge sample1(in=a) result(in=b);
by id yearmonth;
if a;
run;

One sample dataset (dataset name - sample).

ID  YEARMONTH   NO_OF_CUST
1    200909        50
1    201005        65
1    201008        78
1    201106        95
2    200901        65
2    200902        45
2    200903        69
2    201005        14
2    201006        26
2    201007        98

One master dataset - dataset name (test) (huge dataset over the year for each id from start of the account to till date.)

ID  YEARMONTH   NO_OF_CUST
1   200808        125
1   200809        125
1   200810        111
1   200811        174
1   200812        98
1   200901        45
1   200902        74
1   200903        73
1   200904        101
1   200905        164
1   200906        104
1   200907        22
1   200908        35
1   200909        50
1   200910        77
1   200911        86
1   200912        95
1   201001        95
1   201002        87
1   201003        79
1   201004        71
1   201005        65
1   201006        66
1   201007        66
1   201008        78
1   201009        88
1   201010        54
1   201011        45
1   201012        100
1   201101        136
1   201102        111
1   201103        17
1   201104        77
1   201105        111
1   201106        95
1   201107        79
1   201108        777
1   201109        758
1   201110        32
1   201111        15
1   201112        22
2   200711        150
2   200712        150
2   200801        44
2   200802        385
2   200803        65
2   200804        66
2   200805        200
2   200806        333
2   200807        285
2   200808        265
2   200809        222
2   200810        220
2   200811        205
2   200812        185
2   200901        65
2   200902        45
2   200903        69
2   200904        546
2   200905        21
2   200906        256
2   200907        214
2   200908        14
2   200909        44
2   200910        65
2   200911        88
2   200912        79
2   201001        65
2   201002        45
2   201003        69
2   201004        54
2   201005        14
2   201006        26
2   201007        98

Desired Output should like below,

ID  YEARMONTH   NO_OF_CUST  AFTER_MONTH_1   AFTER_MONTH_2   AFTER_MONTH_3   AFTER_MONTH_4   AFTER_MONTH_5   AFTER_MONTH_6   AFTER_MONTH_7   AFTER_MONTH_8   AFTER_MONTH_9   AFTER_MONTH_10  AFTER_MONTH_11  AFTER_MONTH_12
1     200909        50         77              86                  95          95              87                79                71               65           66           66                78                88
1
Did you try sorting by descending YEARMONTH and then using the same algorithm?Tom

1 Answers

0
votes

Step1: Join your sample table with the main(test) table and using intnx to get all the values for next 12 months.
Step2: Making a column names "after month"
Step3: Transpose to get your final output

proc sql;
create table  abc as
select a.id,a.yearmonth,b.yearmonth as yearmonth1, b.no_of_cust
from 
sample a
left join
test b
on a.id = b.id  and a.yearmonth <= b.yearmonth <= intnx("month",a.yearmonth,12)
order by a.id,a.yearmonth,b.yearmonth;
quit;

data abc1(drop=col yearmonth1);
set abc;
by id yearmonth;
if first.yearmonth then col=-1;
col+1;
columns = compress("after_month_"||col);
run;

proc transpose data=abc1 out=abc2(rename=(after_month_0 = no_of_cust) drop=_name_);
by id yearmonth;
id columns;
var no_of_cust;
run;

My Output:
enter image description here

Or
If you want to make changes in your query then you could use the below code.

proc sort data=test;
by id descending yearmonth;
run;

data result;
    set test;
    array after_month {13} after_MONTH_0-after_MONTH_12;
    by id;

    if first.id then do;
        do i = 1 to 13;
            after_month(i) = 0;
        end;
    end;

    do i = 13 to 2 by -1;
        after_month(i) = after_month(i-1);
    end;
    after_month(1) = NO_OF_CUST;

    drop i after_MONTH_0;
    retain after_MONTH:;
run;  

data sample1;
set sample(drop=no_of_cust);
run;

proc sort data=result;
by id yearmonth;
run;
proc sort data=sample1;
by id  yearmonth;
run;

data all;
merge sample1(in=a) result(in=b);
by id yearmonth;
if a;
run;

Let me know in case of any queries.