I have approximately 27,000 .csv files on a SAS directory, these have also been copied onto a network drive so I have the option of using either.
I need to merge all the csv files into one data set and delete any empty rows (the empty rows contain a comma in column A).
Each CSV file has a unique name but the structure and formatting is the same.
In the final dataset… I’d like column A to contain the filename with data being copied from the import file range A1:down to S1:down into columns B to T.
I’ve tried with the below code, but it fails after around 1,000 files:
x 'cd C:\temp';
filename csv ('*.csv');
proc import out=work.LSPImportFiles
datafile = csv DBMS=CSV REPLACE;
GETNAMES=yes;
run;
I’ve also tried the code below, but this seems to miss out columns 8 and 9 and fails after few a hundred:
data want;
length _filename_ $32;
infile C:\temp\*.csv" dlm = ',' filename = _filename_;
input @;
if _filename_ ne lag1(_filename_) then delete;
input
Column_1 :$15.
Column_2 :$16.
Column_3 /*this is a number to 2 decimal places*/
Column_4 /*this is a number to 2 decimal places*/
Column_5 /*this is a number to 2 decimal places*/
Column_6 /*this is a number to 2 decimal places*/
Column_7 /*this is a number to 2 decimal places*/
Column_8 /*this is a percentage*/
Column_9 /*this is a percentage*/
Column_10 /*this is a number to 2 decimal places*/
Column_11 /*this is a number to 2 decimal places*/
Column_12 /*this is a number to 2 decimal places*/
Column_13 /*this is a number to 2 decimal places*/
Column_14 /*this is a blank column*/
Column_15 /*this is a number to 2 decimal places*/
Column_16 /*this is a number to 2 decimal places*/
Column_17 /*this is a number to 2 decimal places*/
run;