
I'm running the array code below:

DATA Want;
    SET Have;
        ARRAY Dates{2562} (&Start_Date:&End_Date);
        DO i = 1 TO DIM(Dates);
            IF Dates[i] >= ObStartDate AND Dates[i] <= ObEndDate THEN Dates[i] = 1;
        END;
RUN;

I have found the minimum date (i.e. the first ObStartDate in my dataset) and the maximum date (i.e. the last ObEndDate in my dataset), and those values are stored in &Start_Date and &End_Date. The array is created correctly and holds unformatted SAS date values for each observation. I also want to run through each observation and, wherever a value in one of the Dates array columns falls between that observation's own start and end dates, replace that value with 1.

Here's where it starts to go wrong. It retains the ObStartDate and ObEndDate from observation to observation, and only replaces different Dates[i] values when it picks up a lower ObStartDate or a higher ObEndDate.

Is there a way I can reset ObStartDate and ObEndDate to each observation's own values when the array's DO loop reaches each consecutive observation?

I've tried creating the array and running the DO loop in a separate data step. I've also tried nesting loops inside loops, arrays inside loops, and so on. I may have been close to success, but this is the code that I thought would work, and the first code that I wrote.

Any help will be greatly appreciated. Cheers.

Here is some code to show what I mean:

DATA Haveyay;
    ATTRIB Ob LENGTH=3
            ObStartDate Length=3
            ObEndDate Length=3;
INFILE datalines DELIMITER='~';
    INPUT Ob ObStartDate ObEndDate ;
    DATALINES;
    1~1~8
    2~2~5
    3~5~10
    4~1~4
    5~2~3
    6~4~7
    7~7~10
    8~3~4
    9~3~9
    10~2~9
;
RUN;

PROC SQL Noprint;
    SELECT min(ObStartDate), max(ObEndDate) into :Start_Date, :End_Date
    FROM Haveyay;
QUIT;

DATA Wantyay;
    SET Haveyay;
        ARRAY Dates{10} (&Start_Date:&End_Date);
            DO i = 1 to DIM(Dates);
                IF Dates[i] >= ObStartDate AND Dates[i] <= ObEndDate THEN Dates[i] = 1;
            END;
RUN;
It will not 'retain' ObStartDate; or, more accurately, assuming that's a variable on the SET dataset, it is retained but gets replaced on each iteration. So if that's exactly your code, it won't do what you're describing. It's possible your input dataset is constructed incorrectly in the first place. - Joe
In any event, if you want useful help, provide example data and example output. Give us a HAVE dataset, and show what WANT should look like. - Joe
Thanks for your quick reply. Any chance you could elaborate on how I would go about checking whether my input dataset is constructed wrongly? Is the array just assuming it's one big observation and not picking up on when the next observation starts? Would some sort of observation indexing work? - LLNZ
I will start constructing an example dataset now. - LLNZ
Please edit it into the question, and don't use an image. Write a data step that will input your variables and values. That way we can just copy/paste it instead of having to do the work ourselves. - Joe

1 Answer


It looks like your problem may be that you are expecting the values in the Dates array to be reset to their original values with each observation. In reality, the ARRAY statement initialises the values in the array only once, before any data is loaded. Because array variables that are given initial values are automatically retained, each change you make to a member of the array is carried forward into later observations.
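To see this behaviour in isolation, here is a minimal sketch. It assumes the Haveyay dataset from the question already exists; the three-element array and the value 99 are made up purely for illustration.

```sas
/* Illustration only: a small array with initial values (1 2 3). */
DATA _NULL_;
    SET Haveyay;
    ARRAY Dates{3} (1 2 3);          /* initialised once, then retained */
    IF Ob = 1 THEN Dates[1] = 99;    /* change made on the first observation only */
    PUT Ob= Dates1= Dates2= Dates3=; /* write the current values to the log */
RUN;
```

The log should show Dates1=99 on every observation, not just the first, because the change is never undone.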

You can use a second loop to reset the date values after outputting:

do i = 1 to dim(dates);
    if obstartdate <= dates[i] <= obenddate then dates[i] = 1;
end;
output;
do i = 1 to dim(dates);
    dates[i] = &start_date. + i - 1;
end;

Or, more compactly, calculate each date from i and the macro variable rather than reading it from the array:

do i = 1 to dim(dates);
    _date = &start_date + i - 1;
    dates[i] = ifn(ObStartDate <= _date <= ObEndDate , 1, _date);
end;
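Put together with the sample data from the question, the second approach might look like the sketch below. It assumes Haveyay and the macro variables &Start_Date and &End_Date have already been created as in the question. Sizing the array with %EVAL and dropping the helper variables are my own additions, not part of the original code.

```sas
/* Sketch only: relies on Haveyay, &Start_Date and &End_Date from the question. */
DATA Wantyay;
    SET Haveyay;
    /* Size the array from the macro variables instead of hard-coding it.  */
    /* No initial values are given, so nothing is retained between rows.   */
    ARRAY Dates{%EVAL(&End_Date - &Start_Date + 1)};
    DO i = 1 TO DIM(Dates);
        _date = &Start_Date + i - 1;   /* the date this column stands for */
        Dates[i] = IFN(ObStartDate <= _date <= ObEndDate, 1, _date);
    END;
    DROP i _date;                      /* keep helper variables out of the output */
RUN;
```

Because every element is reassigned on every observation, no reset loop is needed. Note that in the sample data the date value 1 is indistinguishable from the flag value 1; with real SAS date values that collision is unlikely.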