2
votes

I am trying to write a SAS script that will simply read in a SAS .sas7bdat data file and output it in text format. I want dates to be output in YYYYMMDD format. I don't know what the names of the date columns will be. My script is currently:

libname tmplib '~/testdatadir/';
OPTIONS MISSING='00'x;
data tmpdata;
set tmplib.testdatafile;
array flds{*} _NUMERIC_;
do i=1 to dim(flds);
  if missing(flds(i)) then flds(i)=.;
end;
array charflds{*} _CHARACTER_;
do i=1 to dim(charflds);
  if missing(charflds(i)) then charflds(i)=' ';
end;
drop i;
RUN;

PROC EXPORT
  DATA = tmpdata
  OUTFILE = 'testdataoutfile.txt'
  DBMS = TAB REPLACE;
  PUTNAME = YES;
RUN;

I would either like to iterate through all date fields (as I do with NUMERIC fields and CHARACTER fields), or add a check for each NUMERIC field testing whether it is a date (then I could change the format), or add an option to PROC EXPORT to indicate the output date format. Any other approach to get the output file to have dates formatted as YYYYMMDD would be acceptable as well.

2
is there something in your data that will indicate a variable is a date? is it already formatted as a date? does it have something in the name that indicates it is a date?Jay Corbett

2 Answers

5
votes

You can use the varfmt function to find the format of a numeric variable, which can help you determine whether it’s a date, i.e. whether its format is a date format. Technically you could have a date that isn’t in a correct date format, so it’s just displayed as 16239 or something, but those are hard to detect because it could also just be the number 16,239. This method will find anything that’s displayed as a date in the data browser.

Then use some combination of put and input to get it into YYYYMMDD format.

One issue with just using put is that I think by default it will return a character variable, so you’ll either need to create a new character variable to hold the date, or convert the YYYYMMDD back into an 8-digit number, so the number 20120501 instead of the character string 20120501.

For example 2, you should note that varfmt returns the format of a variable given the dataset name and a variable number. In the example they set up a separate vars table whose purpose is to loop through all the variables.

4
votes

I would suggest changing the variable format instead. Using put/input may cause incorrect results when the variable format is not changed. For example, if you change dates to years using put/input, but leave the variable format as date9, then the dates will still be read as date9 (the underlying value will be year). If you export to CSV the dates will be incorrect.

Here's a simple macro that will check all variables and isolate the dates in order to modify their format to year. You would need to specify which date formats could be in your dataset (or create a more exhaustive list than what is in the macro), and modify it to produce the data format you need.

%macro _toyear(dsin=,dsout=);
/* proc contents will list all variables in the dataset, with formats */
proc contents data=&dsin out=_contents noprint; 
    run;
data _contents; 
    set _contents (where=(format in: ("DATE", "MMDDYY", "MMYY"))); 
    run;

/* use proc sql to create a macro variable with a list of the date variables */
proc sql noprint;
    selet name into: datevars separated by " " from _contents;
    quit;

/* simple error checking, in case there are no date variables */
%let dsid = %sysfunc(open(_contents, is));
%let nlobs = %sysfunc(attrn(&dsid, nlobs));

/* output dataset */
data &dsout;
    set &dsin;
    %if &nlobs ne 0 %then %do; format &datevars year4.; %end;
    run;

/* clean-up */
%let rc = %sysfunc(close(&dsid));
proc datasets nolist; delete _contents;
    run;
%mend _toyear;