1
votes

Let say I have a table with 3 variable: var1, var2, var3 and another table of reference.

Now I want to create anothe variable: var4; var4 is computed from var1, var2 and var3 and the reference table.

I could create a macro to do that (for each line), but I dont know how to pass the value from the macro (a table) into a result. I wonder how can we do it by creating a function to get a value, not a macro?

I understand that if I can create a function like that (this function will include data step and proc summary), thing will be easy like:

var4 = myfunction(var1, var2, var3).

Update:

The table reference in my case is:

age range1 range2 range3
1   10     1       8 
2   0      4       1
3   4      6       1
4   6      5       2
5   10     5       6

So I want my function to get like:

var4 = myfunction(var1, var2, var3), for example var1 = 2, var2 = 2, var3 = 5:

Take the sum of range2 (correspond to var1 = 2); from line 2 to line 5 (corrspond to var2= 2 and var3 = 5) and the result will be: 4 + 6 + 5 + 5.

Thanks in advance.

2
Please explain in more detail what your "function" is. Also what information are you getting from the other table? Do you mean that you need match observations between the two tables to figure out what numbers to use in the "function"? Or does the function need to use ALL of the information from the other table in every calculation. How large is the table the function needs to use? Small enough to fit in memory?Tom
You're going to have to provide more details and I suspect this is the least efficient way to do this task.Reeza
Probably don't need macro. What is in the reference table (structure and values), and what role does it have in the computation ?Richard
I just updated my question, please take a look at it.tmtran99
So it sounds like a better name for your functions parameters would be : newvar = myfunction(column_number,first_row,last_row)?Tom

2 Answers

1
votes

One solution could be to first modify the reference table to a long format with one row per age per range number, and with the accumulated range in a new variable. The sum from var2 to var3 then becomes the difference between two accumulated values, which is easier to compute in a double join:

* Define input data with parameters;
data have;
input var1 var2 var3;
datalines;
1 1 3
2 2 5
3 1 4
;
run;

* Define reference table;
data ref_table;
input age range1 range2 range3;
datalines;
1  10  1  8 
2  0   4  1
3  4   6  1
4  6   5  2
5  10  5  6
;
run;

* Modify reference table to long format with accumulated ranges;
data mod_ref_table;
   set ref_table;

   * Calculated accumulated values of each range;
   acc_range1 + range1;
   acc_range2 + range2;
   acc_range3 + range3;

   * Output the accumulated values for each range;
   range_no = 1;
   acc_range = acc_range1;
   output;
   range_no = 2;
   acc_range = acc_range2;
   output;
   range_no = 3;
   acc_range = acc_range3;
   output;
   
   keep age range_no acc_range;
run;

* Calculate output;
proc sql;
   create table want as 
      select a.*
            ,case 
               when a.var2 = 1 then c.acc_range
               else c.acc_range - b.acc_range
            end as val4
      from have as a
      left join mod_ref_table as b
         on a.var1 = b.range_no and a.var2-1 = b.age
      left join mod_ref_table as c
         on a.var1 = c.range_no and a.var3 = c.age 
   ;
quit;

0
votes

Macro returning a value

OK the solution can be found here.

Anyway thanks for your support.