2
votes

I have a SAS table which has a numeric variable age. I need to construct new variables depending on the value of age. New variables should have this logic:

  • if 0<=age<=25 then age0=1 else age0=0
  • if 26<=age<=40 then age25=1 else age25=0 (note that age25 is a different variable from age0)

So I wrote this code, using a macro to avoid repetition:

%macro intervalle_age(var,var1,var2);
if (&var=>&var1) and (&var<=&var2);
then return 1;
else return 0;
%mend;

Then I call the macro to get the value of each new variable:

age0=%intervalle_age(age,0,25);
age25=%intervalle_age(age,26,40);
age25=%intervalle_age(age,41,65);
age25=%intervalle_age(age,65,771);

But this doesn't work!

How can I resolve it, please? Thank you in advance!


2 Answers

2
votes

I agree with Nikolay that you should step back and avoid macros altogether. The sample code you posted also appears to be incorrect: it has four conditionals for different age ranges, but they are assigned to only two variables.

In SAS, a logical evaluation resolves to 1 for true and 0 for false. Additionally, numeric variables can be used in logical expressions: non-zero, non-missing values mean true, and zero or missing values mean false.

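So a sequence of code for assigning age range flag variables would be the following. Each lower bound here is exclusive, so age = 0 sets no flag; write 0 <= age <= 25 if zero should count, as in the question: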

age0  =  0 < age <= 25 ;
age25 = 25 < age <= 40 ;
age40 = 40 < age <= 65 ;
age65 = 65 < age <= 71 ;
age71 = 71 < age ;
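The two rules above (a comparison yields 1 or 0, and a numeric value counts as true only when it is non-zero and non-missing) can be checked with a small test step. This is a minimal sketch; the variable names x, flag and truth are made up for illustration.

```sas
data _null_;
  length truth $5;
  /* try a missing value, zero, and an ordinary number */
  do x = ., 0, 3;
    flag = (x > 1);            /* a comparison yields 1 or 0 */
    if x then truth = 'true';  /* non-zero and non-missing counts as true */
    else truth = 'false';      /* zero and missing count as false */
    put x= flag= truth=;       /* write the values to the log */
  end;
run;
```

Only x = 3 should come out as true on both tests, because a missing value compares lower than any number.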

Masking simple and readable SAS statements behind a wall of macro code can lead to maintenance issues and make the program harder to understand later. However, if your use case is to construct many sets of these code blocks, a macro that is based on the breakpoints could improve legibility and understanding.

data have; age = 22; bmi = 20; run;
options mprint;

* easier to understand and not prone to copy paste issues or typos;
data want;
  set have;
  %make_flag_variables (var=age, breakpoints=0 25 40 65 71)
  %make_flag_variables (var=bmi, breakpoints=0 18.5 25 30)
run;

This depends on the following macro, which must be compiled before the DATA step above is submitted:

%macro make_flag_variables (var=, breakpoints=);
  %local I BREAKPOINT SUFFIX_LOW RANGE_LOW SUFFIX_HIGH RANGE_HIGH;
  %let I = 1;
  %do %while (%length(%scan(&breakpoints,&I,%str( ))));
    %let BREAKPOINT = %scan(&breakpoints,&I,%str( ));

    %let SUFFIX_LOW = &SUFFIX_HIGH;
    %let SUFFIX_HIGH = %sysfunc(TRANSLATE(&BREAKPOINT,_,.));

    %let RANGE_LOW = &RANGE_HIGH;
    %let RANGE_HIGH = &BREAKPOINT;

    %if &I > 1 %then %do;
        &VAR.&SUFFIX_LOW = &RANGE_LOW < &VAR <= &RANGE_HIGH; /* data step source code emitted here */
    %end; 

    %let I = %eval ( &I + 1 );
  %end;
%mend;

The log snippet shows the code generated by the macro:

92   data want;
93     set have;
94
95     %make_flag_variables (var=age, breakpoints=0 25 40 65 71)
MPRINT(MAKE_FLAG_VARIABLES):   age0 = 0 < age <= 25;
MPRINT(MAKE_FLAG_VARIABLES):   age25 = 25 < age <= 40;
MPRINT(MAKE_FLAG_VARIABLES):   age40 = 40 < age <= 65;
MPRINT(MAKE_FLAG_VARIABLES):   age65 = 65 < age <= 71;
96     %make_flag_variables (var=bmi, breakpoints=0 18.5 25 30)
MPRINT(MAKE_FLAG_VARIABLES):   bmi0 = 0 < bmi <= 18.5;
MPRINT(MAKE_FLAG_VARIABLES):   bmi18_5 = 18.5 < bmi <= 25;
MPRINT(MAKE_FLAG_VARIABLES):   bmi25 = 25 < bmi <= 30;
97   run;
1
votes

return doesn't have any special meaning in SAS macros. Macros are said to "generate" code: the macro invocation is replaced by the text that is left after the macro processor has handled the things it "understands" (basically, tokens starting with & or %).

In your case the macro processor just expands the macro variables (the rest is just text, which the macro processor leaves untouched), resulting in:

age0=if (age=>0) and (age<=25);
then return 1;
else return 0;
age25=/*and so on*/

It's important to understand how the macro processor and regular execution interact (basically, all the macro expansions must be finished before the given DATA or PROC step starts executing).


To make this work you either need to generate the complete if statement, including the assignment to the output var:

%macro calc_age_interval(outvar, inputvar, lbound, ubound);
  if (&inputvar >= &lbound) and (&inputvar <= &ubound) then do;
    &outvar = 1;
  end; else do;
    &outvar = 0;
  end;
%mend calc_age_interval;
%calc_age_interval(outvar=age0, inputvar=age, lbound=0, ubound=25);

Or make it generate an expression that evaluates to either 0 or 1 at execution time. You can assign the result of the boolean expression directly to a variable (it is either 1 or 0 anyway), or wrap it in IFN() to be more explicit:

%macro calc_age_interval(inputvar, lbound, ubound);
  ifn((&inputvar >= &lbound) and (&inputvar <= &ubound), 1, 0)
%mend;
age0 = %calc_age_interval(age, 0, 25); /* expands to age0=ifn(..., 1, 0); */

Taking a step back, I wouldn't bother with macros in this case at all. You can use the in (M:N) range notation, or reset all output variables to 0 and then use an if / else if chain:

if age < 0 then age_missing_or_negative = 1;
else if age <= 25 then age0 = 1;
else if age <= 40 then age25 = 1;
...
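
Both alternatives can be sketched as complete DATA steps. This is a minimal sketch, assuming an input data set named have with a numeric variable age; the output data set names want_in and want_else are made up for illustration, and only the first two ranges are shown. Keep in mind that an M:N range in the IN operator matches whole numbers only, so a value such as 25.5 would set no flag in the first variant.

```sas
/* Variant 1: IN operator with an integer range */
data want_in;
  set have;
  age0  = age in (0:25);   /* 1 if age is an integer from 0 to 25, else 0 */
  age25 = age in (26:40);  /* 1 if age is an integer from 26 to 40, else 0 */
run;

/* Variant 2: reset the flags, then use an if / else if chain */
data want_else;
  set have;
  /* reset every flag first so that each one is always 0 or 1 */
  age_missing_or_negative = 0;
  age0  = 0;
  age25 = 0;
  /* a missing value compares lower than any number, so it lands in the first branch */
  if age < 0 then age_missing_or_negative = 1;
  else if age <= 25 then age0 = 1;
  else if age <= 40 then age25 = 1;
  /* ages above 40 set no flag in this shortened sketch */
run;
```

The second variant also handles non-integer ages, since each branch only tests an upper bound.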