1
votes

Background: I need to use a FILENAME statement (with PIPE) to run grep and read its output as input.

Here is my input data set, named test:

firstname   lastname   filename
<blank>     <blank>    cus_01.txt
<blank>     <blank>    cus_02.txt

The filename values are actual files. I need to grep them because certain strings inside those files are needed to fill in firstname and lastname.

Here is the code:

data work.test;
   set work.test;
   call symputx('file', filename);
   filename fname pipe "grep ""Firstname"" <path>/&file.";
   filename lname pipe "grep ""Lastname"" <path>/&file.";
   infile fname;
   input firstname;
   infile lname;
   input lastname; 
run;

However, macro variables created inside a data step can't be used until after that data step has finished. That means &file. can't be resolved, so it can't be used in the filename statements.

Is there a way to resolve the macro variable?

Thanks!

2
Yes, you can use the RESOLVE function. But I don't think that will help you. The filename statement is a global statement. You've put it inside a data step, but it would still be executed before the data step has even compiled. If you describe the records in cus_01.txt and cus_02.txt in more detail, there are probably better ways to process them in SAS than using grep. – Quentin
How can I use the RESOLVE function here? Doesn't RESOLVE only put the value back into a data step variable? The cus files have lines like: "Firstname: Lonzo" "Lastname: Ball" – Juan Dela Cruz
I don't think it can help you the way the code is structured now. The RESOLVE function can be used to resolve a macro variable created in the same data step, but it resolves when the data step code executes. As your code is structured now, the filename statement will execute before the data step code executes (or is even compiled), so I don't see how your current structure could work. – Quentin
I get it now. This is the only way I know to pull the required strings from the text files. Can you suggest another way? – Juan Dela Cruz

2 Answers

2
votes

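This is not tested. You need to use the FILEVAR= option of the INFILE statement. It takes the file to read (or, with PIPE, the command to run) from a data step variable at run time, so no macro variable is needed.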

data test;
   input (firstname   lastname   filename) (:$20.);
   cards;
<blank>     <blank>    cus_01.txt
<blank>     <blank>    cus_02.txt
;;;;
   run;

data work.grep;
   set work.test;
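   length cmd $128 line $256;
   /* search for the literal labels, not the (blank) value of firstname */
   cmd = catx(' ','grep -E',quote('Firstname|Lastname'),filename);
   putlog 'NOTE: ' cmd=;
   infile dummy pipe filevar=cmd end=eof;
   do while(not eof);
      input;
      line = _infile_; /* keep the matching line for later parsing */
      output;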
      end;
   run;
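
To fill firstname and lastname directly, the grep output can be parsed inside the same loop. The following is an untested sketch, not a definitive implementation. It assumes each file holds one line of the form Firstname: Lonzo and one of the form Lastname: Ball (as described in the comments, without surrounding quotes), that the files are in the current working directory, and that grep supports -E. The data set name work.names is made up for illustration.

```sas
data work.names;
   set work.test(keep=filename);
   length cmd $256 firstname lastname $20;
   /* assumption: labels start in column 1 and are followed by a colon */
   cmd = catx(' ', 'grep -E', quote('^(Firstname|Lastname):'), filename);
   infile dummy pipe filevar=cmd end=eof;
   do while(not eof);
      input;
      /* the text after the first colon is the value */
      if upcase(scan(_infile_, 1, ':')) = 'FIRSTNAME' then
         firstname = strip(scan(_infile_, 2, ':'));
      else if upcase(scan(_infile_, 1, ':')) = 'LASTNAME' then
         lastname = strip(scan(_infile_, 2, ':'));
      end;
   output; /* one row per input file */
   run;
```

Because firstname and lastname are created in this step rather than read by SET, they are reset to missing at the start of each iteration, so values do not carry over from one file to the next.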
0
votes

If you have many customer files, piping to grep for each one can be an expensive operating system action, and on SAS servers operating system access (pipe, x, system, etc.) is potentially disallowed.

You can read all pattern-named files in a single data step using the wildcard feature of infile and the filename= option to capture the active file being read from.

Sample:

%let sandbox_path = %sysfunc(pathname(WORK));

* create 99 customer files, each with 20 customers;

data _null_;
  length outfile $125;
  do index = 1 to 99;
    outfile = "&sandbox_path./" || 'cust_' || put(index,z2.) || '.txt';
    file huzzah filevar=outfile;
    putlog outfile=;

    do _n_ = 1 to 20;
      custid+1;
      put custid=;
      put "firstname=Joe" custid;
      put "lastname=Schmoe" custid;
      put "street=";
      put "city=";
      put "zip=";
      put "----------";
    end;
  end;
run;

* read all the customer files in the path;
* scan each line for 'landmarks' -- either 'lastname' or 'firstname';    

data want;
  length from_whence source $128;
  infile "&sandbox_path./cust_*.txt" filename=from_whence ;
  input;
  source = from_whence;  /* copy after the read so it names the file this line came from */

  select;
    when (index(_infile_,"firstname")) topic="firstname";
    when (index(_infile_,"lastname")) topic="lastname";
    otherwise;
  end;

  if not missing(topic);

  line_read = _infile_;
run;
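
Applied to the files in the question, the same wildcard read could fill firstname and lastname with one row per file. This is an untested sketch under several assumptions: the files match cus_*.txt under the question's <path> placeholder, each contains a line like Firstname: Lonzo followed later by a line like Lastname: Ball, and the data set name want_names is hypothetical.

```sas
data want_names;
  length from_whence filename $256 firstname lastname $20;
  retain firstname;  /* hold the first name until the last name is read */
  infile "<path>/cus_*.txt" filename=from_whence;
  input;
  filename = from_whence;  /* full physical name of the file just read */

  if index(_infile_, 'Firstname:') = 1 then
    firstname = strip(scan(_infile_, 2, ':'));
  else if index(_infile_, 'Lastname:') = 1 then do;
    lastname = strip(scan(_infile_, 2, ':'));
    output;                   /* assumes Firstname precedes Lastname */
    call missing(firstname);  /* do not leak into the next file */
  end;
run;
```

Here filename holds the full path rather than the bare name, so joining back to work.test would need the directory stripped first, for example with scan(filename, -1, '/').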