0
votes

I am having some 50 excel files (xlsx format) to be imported to sas and then need to append it for analysis. All the excel files header are same i.e., the variable names are same for all the file. I need macro for importing and appending all of them at a time rather than importing all the files one by one and then later append it. Your help is much appreciated.

The other issue with the excel file is that there is a blank column next between the variable name and data points. I have written a code remove it using data step but came we write this also in the macro while importing.

Data  XXX.yyy;
  Set XXX.yyy;
    if missing(coalesceC(of ASC Brand Cdesc1 Cust_ DGM Desc Family Grp1 High_Level_Product_Desc 
    Issf Name Prod_Desc Product__Code RVP SA_Desc Terr_ UOM Yr 
    )) and missing(coalesce(of Acc Int_Margin M_Cost Mth Net_Sales Sls__ Uts )) then delete;

run;

2

2 Answers

1
votes

It sounds as though your existing code already does what you need it to do. I doubt there will be much of a performance gain from attempting to import all 50 files in one data step (which is possible via dde, but rather fiddly).

If your existing code is set up to process just one hard-coded file, I'd suggest using it to write a simple macro that takes one excel file as input, imports that file, and appends it to the master dataset. Then you can call the macro 50 times.

e.g. You could write the macro as something like this, incorporating all the relevant bits of your code, and replacing all references to specific files with macro variables:

%macro import_and_append(excel_file,base_dataset);
  proc import datafile = "&excel_file" dbms = excel out = t_import;
  run;

  proc append base = &base_dataset data = t_import;
  run;

  proc datasets lib = work nolist nowarn;
    delete t_import;
    run; 
  quit;
%mend;

Then you can call the macro like so:

%import_and_append(c:\excel_file_01.xls,work.master_dataset)
0
votes

Another way to do this would be to use the Excel LIBNAME Engine. You declare a library to each of your files, and then read all the sheets in 1 Data Step.

In this example, I have 2 workbooks (Book1.xlsx and Book2.xlsx) in C:\temp. All data is in Sheet1. 3 variables -- X, Y, and Z. Modify as needed for your purpose.

data files;
format file $12.;
input file $;
datalines;
Book1.xlsx
Book2.xlsx
;
run;

%macro read_excel(dir,outdata,files);
   data _null_;
   set &files end=last;

   call execute("libname t" || strip(put(_n_,8.)) || " excel '&dir\" || strip(file) || "';");

   if last then
       call symput("n",_n_);
   run;

   data &outdata;
   set 
   %do i=1 %to &n;
       t&i.."Sheet1$"n
   %end;
   ;

   a = sum(x,y,z);
   if missing(a) then delete;
   run;

   %do i=1 %to &n;
      libname t&i clear;
   %end;
%mend;

%read_excel(c:\temp,data_from_excel,files);