0
votes

I am not familiar with SAS base and macro language syntax ,my codes keep going wrong..can someone offer a piece of SAS macro code of my pseudocode.

1.create a macro array to store all the distinct variable in table Map_num;

select distinct variable:into numVarList separated by ' ' from Map_num;

quit;

2.for loop the macro array numVarList and for loop each value of each element

(1)pick up the ith element

(2)for loop all the value of the ith element,

(3)if the value of the customer (from customerScore table)is within the scale of "start" and "end",then update score=score+woe*beta

for example:

the customerScore table is:

+--------+--------+---------+---------+----------+---------+---------+---------+---------+---------+---------+---------+-------+
| cst_id |   A    |    B    |    C    |    D     |    E    |    F    |    G    |    H    |    I    |    J    |    K    | score |
+--------+--------+---------+---------+----------+---------+---------+---------+---------+---------+---------+---------+-------+
|      1 | 688567 |     873 |  134878 |   546546 |    3123 |       6 |    5345 |  768678 |  348957 | -921839 |   -8217 |     0 |
|      2 |   3198 |   54667 | 9789867 | 53456756 |   78978 |    6456 |     645 |     534 |    -219 |   13312 |    4543 |     0 |
|      3 |  35324 | 6456568 |      43 |    56756 |   -8217 |  688567 |     873 |  134878 |      12 |   89173 |  213142 |     0 |
|      4 | 348957 | -921839 |   -8217 |     5345 |  434534 |    3198 |   54667 | 9789867 |   -8217 |   -8217 | 8908102 |     0 |
|      5 |   -219 |   13312 |    4543 |     4234 |   54667 |   35324 | 6456568 |      43 |  213142 |  213142 |     213 |     0 |
|      6 |     12 |   89173 |  213142 |    23234 |  348957 | -921839 |   -8217 |  688567 |     873 |  134878 |   23424 |     0 |
|      7 | 688567 |   89173 |  213142 |    -8217 |    -219 |   13312 |    4543 |    3198 |   54667 | 9789867 |    3434 |     0 |
|      8 |   3198 |   -8217 |   21313 |    -8217 |      12 |   89173 |  213142 |   35324 | 6456568 |      43 |    3123 |     0 |
|      9 |  35324 |   -8217 |  688567 |   688567 |     873 |  134878 |  688567 |     873 |  134878 |   -8217 |      11 |     0 |
|     10 | 348957 |   89173 |  213142 |     3198 |   54667 | 9789867 |    3198 |   54667 | 9789867 |   -8217 |    3198 |     0 |
|     11 |   -219 | -921839 |   -8217 |    35324 | 6456568 |      43 |   35324 | 6456568 |      43 | -921839 |   -8217 |     0 |
|     12 |     12 |   13312 |    4543 |    89173 |    4234 |    3198 |  688567 |     873 |  134878 |   13312 |    4543 |     0 |
|     13 |     12 |   89173 |  213142 |   348957 | -921839 |   -8217 |    3198 |   54667 | 9789867 |   89173 |  213142 |     0 |
|     14 |      2 |   89173 |  213142 |     -219 |   13312 |    4543 |   35324 | 6456568 |      43 |   54667 |    4543 |     0 |
|     15 | 348957 | -921839 |   -8217 |       12 |   89173 |  213142 |   13312 |    4543 |   89173 |    4234 |    4543 |     0 |
|     16 |   -219 |   13312 |   35324 |  6456568 |      43 |  213142 |   89173 |  213142 |  348957 | -921839 |   -8217 |     0 |
|     17 |     12 |   89173 | -921839 |    -8217 |  688567 |     873 |   89173 |  213142 |    -219 |   13312 |    4543 |     0 |
|     18 | 688567 |     873 |   13312 |     4543 |    3198 |   54667 | -921839 |   -8217 |      12 |   89173 |  213142 |     0 |
|     19 |   3198 |   54667 | 9789867 |   688567 |     873 |  134878 |      43 |  213142 |  213142 |     213 | 9789867 |     0 |
|     20 |  35324 | 6456568 |      43 |       43 |  213142 |  213142 |     213 |   89173 |    4234 |    3198 |  688567 |     0 |
+--------+--------+---------+---------+----------+---------+---------+---------+---------+---------+---------+---------+-------+

if table Map_num is below,then cst_id score is update:score=0+(-1.2)*3 + 2*3 + (0.1)*3 + 7*3

+----------+------------+------------+------+------+
| variable |   start    |    end     | woe  | beta |
+----------+------------+------------+------+------+
| A        | -999999999 |      57853 |   -1 |    3 |
| A        |      57853 |      89756 | -1.1 |    3 |
| A        |      89756 |     897452 | -1.2 |    3 |
| A        |     897452 | 9999999999 | -1.3 |    3 |
| B        | -999999999 |       4235 |    2 |    3 |
| B        |       4235 |      65785 |    3 |    3 |
| B        |      65785 | 9999999999 |    4 |    3 |
| C        | -999999999 |       9673 |  3.1 |    3 |
| C        |       9673 |      75341 |  2.1 |    3 |
| C        |      75341 |      98543 |  1.1 |    3 |
| C        |      98543 |     567864 |  0.1 |    3 |
| C        |     567864 | 9999999999 |   -1 |    3 |
| D        | -999999999 |       8376 |    5 |    3 |
| D        |       8376 |      93847 |    6 |    3 |
| D        |      93847 | 9999999999 |    7 |    3 |
+----------+------------+------------+------+------+

if table Map_num is below,then cst_id score is update:score=0+3*2 + 5*2 + 0*2 + 7*2 +3*2

+----------+------------+------------+-----+------+
| variable |   start    |    end     | woe | beta |
+----------+------------+------------+-----+------+
| E        | -999999999 |          3 |   1 |    2 |
| E        |          3 |     500000 |   3 |    2 |
| E        |     500000 |     800000 |   2 |    2 |
| E        |     800000 | 9999999999 |   4 |    2 |
| A        | -999999999 |       6700 |   6 |    2 |
| A        |     590000 |     680000 |   4 |    2 |
| A        |     680000 | 9999999999 |   5 |    2 |
| C        | -999999999 |      89678 |   9 |    2 |
| C        |      89678 |     566757 |   0 |    2 |
| C        |     566757 |     986785 | 2.8 |    2 |
| C        |     986785 | 9999999999 | 1.1 |    2 |
| K        | -999999999 |       7865 |   7 |    2 |
| K        |       7865 |      25637 |   9 |    2 |
| K        |      25637 |      65742 |   8 |    2 |
| K        |      65742 | 9999999999 | 0.2 |    2 |
| B        | -999999999 |      56753 |   3 |    2 |
| B        |      56753 |    5465624 |   4 |    2 |
| B        |    5465624 | 9999999999 |   1 |    2 |
+----------+------------+------------+-----+------+

thanks in advance!

table customerScore and Map_num are changing everyday for each rows and their column name:variable,start,end,woe,beta are not changed.I need to update the column score in table customerScore and the score is according to table Map_num.If the column A value in table customerScore is 688567 ,so it is 89756 <688567<897452,so the socre will be update:score=score+(-1.2 )* 3...is that clear for you?! it is a nested loop using SAS macro as I comprehended.

2
It is not at all clear what you are asking. Can you show an actual example calculation of SCORE? The formulas you have written do not involve any variables, so they are just complicated ways to write a constant.Tom
@Tom table customerScore and Map_num are changing everyday for each rows and their column name:variable,start,end,woe,beta are not changed.I need to update the column score in table customerScore and the score is according to table Map_num.If the column A value in table customerScore is 688567 ,so it is 89756 <688567<897452,so the socre will be update:score=score+(-1.2 )* 3...is that clear for you?!J.Cooker
So if the value of A is between START and END then you want to multiply WOE times BETA? Did you try just doing that?Tom

2 Answers

0
votes

Unfortunately the customerScore is not in a form that is readily aligned for a really simple SQL computation.

SQL way

One important aspect is to recognize the selection of map and woe for each score part from map_num can be done relatively easily in SQL, but processing the individual variables has to be 'coaxed' through macro

Consider only the variable A from the first map_num as a example case.

select (
  map_num.woe * map_num.beta 
  from map_num  
  where map_num.variable="A"
    and map_num.start < customerScore.A <= mapnum.end
 ) as A_contribution_to_score
from 
  customerScore

Now consider the B contribution that is added to the overall expression

select (
  map_num.woe * map_num.beta 
  from map_num  
  where map_num.variable="A"
    and map_num.start < customerScore.A <= mapnum.end
  )
+ 
select (
  map_num.woe * map_num.beta 
  from map_num  
  where map_num.variable="B"
    and map_num.start < customerScore.B <= mapnum.end
  )
from 
  customerScore

You should see that a macro could determine the distinct map_num values of variable to be used to construct a rather lengthy SQL expression that searches for the appropriate woe and beta product to apply to each row in customerScore.

The macro and SQL update statement could be something like

%macro updateScore (data=, map=);

  %local i n_var;

  proc sql noprint;
    select distinct variable into :variable1- from &map;
    %let N_var = &sqlobs;

    update &data as OUTER
    set score = score
    %do I = 1 %to &N_var;
      %let variable = &&variable&i;
      +
      ( select 
        INNER.woe * INNER.beta
        from &map as INNER
        where INNER.variable="&variable"
          and INNER.start < OUTER.&variable <= INNER.end
      )
    %end;
    ; /* end of update statement */
  quit;
%mend;

%updateScore(data=customerScore, map=map_num)

Your data structure needs a bit of work if you want the score update made via a map_num to be reversible (i.e. capable of having an undo action applied).

If tracking the map selections is important you would want an additional similar query in the macro that creates a table recording the important aspects of the map data selection

    create table mapplication as 
    select cst_id
    %do I = 1 %to &N_var;
      %let variable = &&variable&i;
      %let innerness = from &map as INNER where INNER.variable="&variable" and INNER.start < OUTER.&variable <= INNER.end;
      , &variable
      , ( select INNER.woe &innerness ) as &variable._woe
      , ( select INNER.beta &innerness ) as &variable._beta
      , ( select INNER.start &innerness ) as &variable._start
      , ( select INNER.end &innerness ) as &variable._end
    %end;
    from &data as OUTER;

Examining the 'mapplication' data can possibly help diagnose bad map_num data.

0
votes

First let's start with a working set of data so we have something that SAS code can work with.

data cust ;
  input cst_id A B ;
cards;
1 688567   873
2   3198 54667
;
data map_data ;
  input variable :$32. start end woe beta ;
cards;
A  -999999999      57853   -1  3
A       57853      89756 -1.1  3
A       89756     897452 -1.2  3
A      897452 9999999999 -1.3  3
B  -999999999       4235    2  3
B        4235      65785    3  3
B       65785 9999999999    4  3
;

If you want to combine the first table with the second then you need to transpose it.

proc transpose data=cust out=cust_data(rename=(col1=value)) name=variable  ;
  by cst_id ;
run;

The result for our small example looks like this.

Obs    cst_id    variable     value
 1        1         A        688567
 2        1         B           873
 3        2         A          3198
 4        2         B         54667

Since the transpose has moved the variable names into data values instead of metadata values we can now easily join the customer data with the map data.

I will assume that you only want the cases where the value of the variable falls between the START and END variables.

proc sql ;
  create table want as 
    select * 
    from cust_data a
    inner join map_data b
    on a.variable = b.variable
    and a.value between b.start and b.end 
    order by 1,2
  ;
quit;

For this little sample it would be this data.

Obs    cst_id    variable     value         start      end      woe    beta
 1        1         A        688567         89756    897452    -1.2      3
 2        1         B           873    -999999999      4235     2.0      3
 3        2         A          3198    -999999999     57853    -1.0      3
 4        2         B         54667          4235     65785     3.0      3

At this point you now have something that might make it possible to calculate a score, if you could explain what the formula is.

So assuming that you want to take the sum of WOE*BETA then your SQL query should probably look like this.

proc sql ;
  create table scores as 
    select a.cst_id,sum(woe*beta) as score 
    from cust_data a
    inner join map_data b
    on a.variable = b.variable
    and a.value between b.start and b.end 
    group by 1
    order by 1
  ;
quit;

Which has this result.

Obs    cst_id    score
 1        1       2.4
 2        2       6.0

Not sure where macro code or looping would help with this problem. If the names of the input dataset vary then you could use macro variables to hold the names, but the input dataset names are used only once each in this code.

For example you could make macro variables CUST, MAP and OUT.

%let cust=work.cust;
%let map=work.map_data;
%let out=work.scores;

Then replace the dataset names in the code with the macro variable references.

proc transpose data=&cust. out=cust_data(rename=(col1=value)) name=variable  ;
  by cst_id ;
run;
proc sql ;
  create table &out. as 
    select a.cst_id,sum(woe*beta) as score 
    from cust_data a
    inner join &map. b
    on a.variable = b.variable
    and a.value between b.start and b.end 
    group by 1
    order by 1
  ;
quit;