0
votes

I need to save the value of a certain variable in a data step to a macro variable and then use that macro variable in the same data step. I tried CALL SYMPUT, but if I've understood correctly, a macro variable created that way cannot be used inside the same data step (it's assigned at the end of the data step, I guess?).

Here is a simplified example. I have a list of data fields t1,...,t100 representing something happening at certain times, events being represented by numbers, and a data field t_start giving me the starting time for the process in which I'm interested for every observation. I want to check if I have all the data, and drop the observation otherwise. I want to proceed as follows.

DATA WANT;
    SET HAVE;

    CALL SYMPUT('START_TIME', t_start);

    DO I=&START_TIME. TO 100;
        IF t_&I. = . THEN DELETE;
    END;
RUN;

This doesn't work, I think for the reasons mentioned above. Is there a workaround?

Remarks:

  1. I simplified the situation; the real case I'm looking at is more complex (e.g. the variables are not called t1,...,t100 but something with a bit more structure). If possible, I would like something as close as possible to the approach I gave above, as different solutions might not be applicable to my case. Of course, if this is not possible, then any solution is more than welcome!
  2. I tried looking at RESOLVE, but it doesn't seem to do what I'm looking for (or at least I don't understand it well enough to make it do what I want).
  3. As a last resort, I could try to solve the problem using two data steps, one defining the macro variables, the other one using it to check and delete the undesired observations. I would prefer avoiding this solution if possible.

Update: I solved the problem using arrays, as suggested in the solutions.

3
Can you use arrays? Why can't you replace start_time with t_start directly? That wouldn't affect anything in the example shown, but the t_&i will not work because you don't have an I macro variable. I suspect you really need arrays, but without more details that's all I can say for now. - Reeza
@Reeza Using t_start directly unfortunately won't work in my "true" case, because t_start holds a string, and only part of that string is the piece that appears in the names of the other variables (the I). Maybe there's a way to do it like that; I'll try on Monday if I don't find a direct solution. - Daniel Robert-Nicoud
If you have a string, that won't work anyway, as indicated. - Reeza
@Reeza Yes, I would have to work a bit on the START_TIME variable, e.g. by %LET START_TIME_BIS = %SUBSTR(&START_TIME,1,2); or something like that. But the line of reasoning would remain the same. - Daniel Robert-Nicoud
There's no reason you can't do that in a data step either, though. In fact it's easier: no need for %SYSFUNC(), and you can use COMPRESS() to remove characters if needed. - Reeza

3 Answers

2
votes

You are trying to use the value of the dataset variables t_start and i to figure out which variable to test. That is what arrays are for.

DATA WANT;
  SET HAVE;
  array t t1-t100;
  DO I=t_start TO 100;
    IF t(i) = . THEN DELETE;
  END;
RUN;

No need for macro variables, much less macro variables that can travel backwards in time and modify the code of a data step after it has already started running.
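For the case raised in the comments, where t_start is a string and only part of it identifies the starting variable, the same array approach should still apply once the index is extracted. The following is a minimal sketch, not the asker's real layout: the sample values such as 'T02_X', the five variables t1-t5, and the helper variable start_idx are all assumptions made for illustration, and it assumes every t_start value contains the index as its only digits.

```sas
/* Hypothetical sample data: t_start is character and only its digits give the index */
data have;
  length t_start $8;
  input t_start $ t1-t5;
  datalines;
T02_X 1 2 3 4 5
T03_X . 2 3 . 5
T04_X . . . 4 5
;
run;

data want;
  set have;
  array t t1-t5;
  /* The 'kd' modifiers keep only digits; INPUT converts the text to a number */
  start_idx = input(compress(t_start, , 'kd'), 8.);
  do i = start_idx to dim(t);
    if missing(t(i)) then delete;
  end;
  drop i start_idx;
run;
```

With this sample data, the second row is dropped because t4 is missing at or after its starting index of 3; the other two rows are kept.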

1
votes

This may be a better approach using arrays. Given what you've posted, this works. If it doesn't match what you need, please post more details about your situation.

data demo;
  array t(10);
  do row=1 to 100;
    do i=1 to 10;
      t(i)=rand('integer', 1, 5);
    end;
    start = rand('integer', 1, 10);
    output;
  end;
run;

data test;
  set demo;
  array t(10);

  do i=start to dim(t);
    * DELETE stops processing this observation at once, so no LEAVE is needed;
    if t(i) < 2 then delete;
  end;
run;

1
votes

When a data step is running, the step has already been 'compiled' and all ampersand ( & ) macro variable references have already been resolved. A running compiled step cannot alter its own source code.

If you submitted your code twice, the first time would log a WARNING: Apparent symbolic reference not resolved. The second time would not have the warning, and would be using the value populated from the prior submission.
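The timing can be seen in a small sketch. The macro variable name start_time and the values 0 and 7 are chosen only for illustration. The &start_time. reference is replaced by text before the step runs, while SYMGET reads the macro variable while the step is executing, so it sees the value just written by CALL SYMPUTX.

```sas
%let start_time = 0;   /* value in place when the step is compiled */

data _null_;
  t_start = 7;
  call symputx('start_time', t_start);           /* assigned while the step runs */
  compiled   = &start_time.;                     /* text substituted before execution: 0 */
  at_runtime = input(symget('start_time'), 8.);  /* read during execution: 7 */
  put compiled= at_runtime=;
run;

%put After the step: &start_time.;
```

The log should show compiled=0 and at_runtime=7, and the %PUT after the step should show 7. Note that SYMGET only returns a value; it cannot turn that value into a variable name such as t_7, which is why an array is still needed to pick the variable.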

Suppose your data record has many variables, and two sentinel variables whose values are the names of the variables that mark the start and stop of some processing to occur. While an unwieldy data construct, you can use an array to mediate access to the variant set of variables to be processed.

For example:

data have;
  input a b c d e f g h start $ stop $;
  datalines;
 1  2  3  4  5  6  7  8  d  e
11 12 13 14 15 16 17 18  a  b
 0  1  1  2  3  5  8 11  c  h
 1  1  1  1  1  .  1  1  a  e  wont be deleted because . is at f
 1  2  3  4  .  6  7  8  a  h
;
run;

data want;
  set have;
  array num a--h;
  do i = 1 to dim(num);
    if vname(num(i)) = start then startindex=i;
    if vname(num(i)) = stop  then stopindex=i;
  end;

  do i = min(startindex,stopindex) to max(startindex,stopindex);
    if missing(num(i)) then delete;
  end;
run;