I have some data sets that I wrote code to clean, following methods from the biological literature, and I then want to split each one into day and night (because they must be analyzed separately). It worked, but now I need to do this for the full set, which is WAY too many files for me to deal with one by one. So I am now trying to write a macro to split the data into days and nights for me.
My data looks like this:
Hour var1 var2 var3
1 123 90 100
2 122 99 108
...........
4 156 80 120
4 156 80 145
4 143 82 132
Basically, night has 1 observation per hour and day has 3. I also have this for many days.
Each dataset is named STUDYIDID#_first or STUDYID_ID#_last. I want to generate four datasets per input dataset. So MYID111_first would create: MYID111_first_day_var1, MYID111_first_day_var2, MYID111_first_night_var1, and MYID111_first_night_var2.
I would then like to append them into 4 datasets: MYID_A_first_day_var1, MYID_A_first_day_var2, MYID_A_first_night_var1, and MYID_A_first_night_var2.
MY CODE SO FAR:
%macro datacut(libname, worklib=work, grp = _A, time1 = _night, time2 = _day, type1 = _var1, type2 = _var2);
%local num i;
proc datasets library=&libname memtype=data nodetails;
contents out=&worklib..temp1(keep=memname) data=_all_ noprint;
run;
data _null_;
set &worklib..temp1 end=final;
by memname notsorted;
if last.memname;
n+1;
call symput('ds'||left(put(n,8.)),trim(memname));
if final then call symput('num',put(n,8.));
run;
%do i=1 %to &num;
/* do the artifact removing method */
DATA &libname..&&ds&i;
SET &libname..&&ds&i;
PT_ID = '&ds&i' ;
IF var1< 60 OR var1> 230 then delete;
IF var2< 30 OR var2> 230 THEN delete;
IF var3 < 60 OR var3 > 135 THEN DELETE;
IF var2 > var1 then delete;
run;
/* get just the night values */
PROC SQL;
CREATE TABLE &libname..&&ds&i&time1 as
SELECT *
FROM &libname..&&ds&i
WHERE Hour BETWEEN 0 and 6 OR Hour BETWEEN 22 and 24
order by systolic
;
QUIT;
/* trim off the proper number of observations for variable 1 */
DATA &libname..&&ds&i&time1&type1;
SET &libname..&&ds&i&time1 end=eof;
IF _N_ =1 then delete;
if eof then delete;
run;
PROC append base= &libname..&&ds&time1&type1
data= &libname..&&ds&i&time1;
run;
QUIT;
%end;
%mend datacut;
%datacut(work)
Now the initial datastep works correctly but the later ones don't rename the data as planned. I get a bunch of datasets called Ds10_night_var1 with the wrong field names (memtype, nodetails, data)
I get the warning:
WARNING: Apparent symbolic reference DS1_NIGHT not resolved.
NOTE: Line generated by the macro variable "TIME1".
1 work.&ds1_night
-
22
200
ERROR 22-322: Expecting a name.
ERROR 200-322: The symbol is not recognized and will be ignored.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
WARNING: Apparent symbolic reference DS1_NIGHT_SYS not resolved.
22: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
COLUMN where the error has occurred.
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;,
_DATA_, _LAST_, _NULL_.
201: LINE and COLUMN cannot be determined.
NOTE: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and COLUMN
where the error has occurred.
ERROR 201-322: The option is not recognized and will be ignored.
So I want the right names for my files AND I want my datasets to actually have data in them, and I don't understand why they don't.
Turn on option mprint symbolgen to see where the error is. Right now it is hard to see which step is generating the error. My guess is you have an extra & somewhere. Also, macro variables only resolve in double quotes, which matters for your PT_ID. – Reeza
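A minimal sketch of what that comment points at, assuming a member named MYID111_first (a hypothetical value used only for illustration) and the default suffixes from the macro in the question. In `&&ds&i&time1` the first scan produces `&ds1_night`, which is exactly the unresolved reference reported in the log. Putting two periods after `&i` should keep the suffix out of the macro variable name: the first period ends `&i` on the first scan, and the second ends `&ds1` on the rescan. The macro name `show_names` is made up for this demonstration and only writes to the log.

```sas
options mprint symbolgen;   /* echo generated code and each macro resolution to the log */

/* show_names is a hypothetical helper, not part of the original macro */
%macro show_names(time1=_night, type1=_var1);
  %local i ds1;
  %let i   = 1;
  %let ds1 = MYID111_first;   /* hypothetical member name */

  /* Two periods after &i: scan 1 gives &ds1._night, scan 2 gives MYID111_first_night */
  %put night table:   work.&&ds&i..&time1;
  %put trimmed table: work.&&ds&i..&time1&type1;

  /* Macro variables resolve inside double quotes, not single quotes */
  data _null_;
    length PT_ID $32;
    PT_ID = "&&ds&i";
    put PT_ID=;
  run;
%mend show_names;

%show_names()
```

With these values, the two %put lines should write work.MYID111_first_night and work.MYID111_first_night_var1 to the log, and the data step should print PT_ID=MYID111_first. The same `..` delimiter would apply anywhere the macro builds a dataset name from `&&ds&i` followed by a suffix. Note also that the PROC APPEND base in the question uses `&&ds&time1&type1` without `&i`, which looks like a separate slip worth checking.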