0
votes

I have a very large number of datasets that are not consistently formatted - I am trying to read them into SAS and normalize them.

The basic need here is to locate a 'key column' that contains a certain string - from there I know what to do with all the variables to the left and right of that column.

The 'GREP' macro from the sas website (http://support.sas.com/kb/33/078.html) seems like it can handle this, but I need help adapting the code in the following ways:
1 - I only need to search one dataset at a time, already in the 'work' library.
2 - I need to capture the name of the variable (and the position number of it) that prints to the log at the end of this macro. This seems like it would be easy but it just returns the last column in the dataset instead of the (correct) column that prints to the log at the end.

Current code below:

%macro grep(librf,string);  /* parameters are unquoted, libref name, search string */
%let librf = %upcase(&librf);
  proc sql noprint;
    select left(put(count(*),8.)) into :numds
    from dictionary.tables
    where libname="&librf";

    select memname into :ds1 - :ds&numds
    from dictionary.tables
    where libname="&librf";

  %do i=1 %to &numds;
    proc sql noprint;
    select left(put(count(*),8.)) into :numvars
    from dictionary.columns
    where libname="&librf" and memname="&&ds&i" and type='char';

    /* create list of variable names and store in a macro variable */

    %if &numvars > 0 %then %do;
      select name into :var1 - :var&numvars 
      from dictionary.columns
      where libname="&librf" and memname="&&ds&i" and type='char';
      quit;

      data _null_;
        set &&ds&i;
          %do j=1 %to &numvars;
            if &&var&j = "&string" then
            put "String &string found in dataset &librf..&&ds&i for variable &&var&j";
          %end;
        run;
    %end;
  %end; 
%mend;
%grep(work,Source Location);  

The log returns: "String Source Location found in dataset WORK.RAW_IMPORT for variable C" (the third), which is correct.

I just need usable macro variables equal to "C" and "3" at the end. This macro will be part of a larger macro (or a prelude to it) so the two macro variables need to reset with each dataset I run through it. Thanks for any help offered.

1

1 Answers

0
votes

Please find the modification below, basically what I have done was to create global macro variables for dataset name and variable name which will feed as input to get the variable position using VARNUM function as below, ( change identified by **** )

%macro grep(librf,string);  
  %let librf = %upcase(&librf);  

  proc sql noprint;
    select left(put(count(*),8.)) into :numds
    from dictionary.tables
    where libname="&librf";

    select memname into :ds1 - :ds&numds
    from dictionary.tables
    where libname="&librf";

    %do i=1 %to &numds;
       proc sql noprint;
         select left(put(count(*),8.)) into :numvars
         from dictionary.columns
         where libname="&librf" and memname="&&ds&i" and type='char';

         /* create list of variable names and store in a macro variable */

         %if &numvars > 0 %then %do;
          select name into :var1 - :var&numvars 
          from dictionary.columns
          where libname="&librf" and memname="&&ds&i" and type='char';
          quit;

          %global var_pos var_nm var_ds;

          data _null_;
            set &&ds&i;
            %do j=1 %to &numvars;

            **** ADDED NEW CODE HERE ****;
            if &&var&j = "&string" then do; /* IF-DO nesting */;
              call symputx("var_nm","&&var&j"); /*Global Macro variable for Variable Name */
              call symputx("var_ds","&&ds&i"); /*Global Macro variable for Dataset Name */
             put "String &string found in dataset &librf..&&ds&i for variable &&var&j";
          %end;
        run;  

        **** ADDED NEW CODE HERE ****;
        %let dsid=%sysfunc(open(&var_ds,i)); /* Open Data set */
        %let var_pos=%sysfunc(varnum(&dsid,&var_nm));  /* Variable Position */
        %let rc=%sysfunc(close(&dsid)); /* Close Data set */;

      %end;
   %end; 
%mend;

%grep(work,Source Location); 

%put &=var_nm &=var_ds &=var_pos;