1
votes

I have a set of csv files named m6.csv through m22.csv, where the number in the file name increments by 1 with each new file. All data sets have a variable called passedchecks, but in only some of the data sets is it useful. I would like to subset several - but not all - of the datasets (m8, m13, m18, and m19) and include only observations where passedchecks = 1 as they're read into SAS without writing a separate macro for these data sets.

The current macro uses proc import to read in the csv files, then it subsets all the new SAS data sets by the passedchecks flag - not just the four of interest. It ends by printing the contents of all data sets to the console.

%let rawcsv = C:\Users\Desktop\rawfolder\;

%macro inputter(first=6, last=22); /*change this when m20-m22 come in for wave 3 (DONE ON 03/02/2021)*/
%do i=&first. %to &last. %by 1;

proc import datafile="&&rawcsv.m&&i..csv" 
                 out=m&i replace 
                dbms=csv; 
        guessingrows=500;
run;

/*BEGIN: part I just want to do where i = 8, 13, 18, and 19*/

data m&i;
set m&i;
where passedchecks=1;
run;

/*END: part I just want to do where i = 8, 13, 18, and 19*/

proc contents data=m&i varnum; 
title2 "contents of m&&i file"; 
run;

%end;
%mend;

%inputter;

Is there a way to modify it so it handles the m8, m13, m18, and m19 data sets differently than the rest?

1
Why are you using PROC IMPORT to read text files? It must guess how to define the variables. So each text file could generate a dataset with different variable attributes. Just write your own data step to read the text files so that you have complete control over how the variables are defined.Tom
these are csv files and there are several thousands of variables in each. The order, structure, and varnames vary in each as well, which makes a data step solution substantially more time intensive and less reproducible if I add more files (e.g., and m23.csv). guessingrows is more compute time but many hours of less effort. The contents of each are reliable enough that the whole "different variable attributes" thing is not a concern. Though (not shown here) I merge the proc contents results and check.J.Q

1 Answers

1
votes

Sure can. Add a condition to check for those values. We'll add the minoperator and mindelimiter=' ' options and check if &i is in a space-separated list that we specify. This option tells the macro facility to treat IN as an operator and use spaces to separate the values to check. For example: &color IN red green blue;

%macro inputter(first=6, last=22, checkData=) / minoperator mindelimiter=' ';
    /*change this when m20-m22 come in for wave 3 (DONE ON 03/02/2021)*/

    /* Clean checkData of any extra spaces */
    %let checkData = %cmpres(&checkData.);

    %do i=&first. %to &last. %by 1;
    
        proc import datafile="&&rawcsv.m&&i..csv" 
                         out=m&i replace 
                        dbms=csv; 
                guessingrows=500;
        run;
    
        /*BEGIN: part I just want to do where i = 8, 13, 18, and 19*/
        
        %if(&i. IN &checkData.) %then %do;
            data m&i;
                set m&i;
                where passedchecks=1;
            run;
        %end;

        /*END: part I just want to do where i = 8, 13, 18, and 19*/
        
        proc contents data=m&i varnum; 
        title2 "contents of m&&i file"; 
        run;
    
    %end;
%mend;
    
%inputter(checkData=8 13 18 19);