I am using the code below to loop through a directory and find all the .xlsx files in each subfolder. I need to adapt it to take the path to each found file and stack it into a dataset as it runs.
data thisfile;
set _NULL_;
run;
%macro drive(dir,ext);
%local filrf rc did memcnt name i;
/* Assigns a fileref to the directory and opens the directory */
%let rc=%sysfunc(filename(filrf,&dir));
%let did=%sysfunc(dopen(&filrf));
/* Make sure directory can be open */
%if &did eq 0 %then %do;
%put Directory &dir cannot be open or does not exist;
%return;
%end;
/* Loops through entire directory */
%do i = 1 %to %sysfunc(dnum(&did));
/* Retrieve name of each file */
%let name=%qsysfunc(dread(&did,&i));
/* Checks to see if the extension matches the parameter value */
/* If condition is true print the full name to the log */
%if %qupcase(%qscan(&name,-1,.)) = %upcase(&ext) %then %do;
%put &dir\&name;
data thisfile_&i.;
format filename $100.;
set thisfile;
filename = "&dir.\&name.";
run;
%end;
/* If directory name call macro again */
%else %if %qscan(&name,2,.) = %then %do;
%drive(&dir\%unquote(&name),&ext)
%end;
%end;
/* Closes the directory and clear the fileref */
%let rc=%sysfunc(dclose(&did));
%let rc=%sysfunc(filename(filrf));
%mend drive;
/* First parameter is the directory of where your files are stored. */
/* Second parameter is the extension you are looking for. */
%drive(v:\,xlsx)
The code works as intended, but I need the result of the line '%put &dir\&name;' to feed into a dataset instead of just printing to the log, so that when the code is done running I will have a dataset with one observation for each .xlsx file.
This seems straightforward enough, but when I attempt to have it create a dataset and initialize a 'filename' variable using &dir\&name, I get a dataset with zero observations. My log results indicate the macro argument is working properly but it doesn't write an observation:
MPRINT(DRIVE): data thisfile_1;
MPRINT(DRIVE): format filename $100.;
MPRINT(DRIVE): set thisfile;
MPRINT(DRIVE): filename = "v:\\MYFILENAME.xlsx";
MPRINT(DRIVE): run;
NOTE: There were 0 observations read from the data set WORK.THISFILE.
NOTE: The data set WORK.THISFILE_1 has 0 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
I know that the code as written will not 'stack' all my results into one set, but I was just trying to get it up and running.