0
votes

My program makes a web-service call and receives a response in XML format which I store as output.txt. When opened in notepad, the file looks like this

<OwnerInquiryResponse xmlns="http://www.fedex.com/esotservice/schema"><ResponseHeader><TimeStamp time="2018-02-01T16:09:19.319Z"/></ResponseHeader><Owner><Employee firstName="Gerald" lastName="Harris" emplnbr="108181"/><SalesAttribute type="Sales"/><Territory NodeGlobalRegion="US" SegDesc="Worldwide Sales" SegNbr="1" TTY="2-2-1-2-1-1-10"/></Owner><Delegates/><AlignmentDetail><SalesAttribute type="Sales"/><Alignments/></AlignmentDetail></OwnerInquiryResponse>

I am unable to read this file into SAS using proc IMPORT. My SAS code is below

proc import datafile="/mktg/prc203/abhee/output.txt" out=work.test2 dbms=dlm replace;
delimiter='<>"=';
getnames=yes;
run;

My log is

1          %_eg_hidenotesandsource;
5          %_eg_hidenotesandsource;
28         


29         proc import datafile="/mktg/prc203/abhee/output.txt" out=work.test2 dbms=dlm replace;
30         delimiter='<>"=';
31         getnames=yes;
32         run;

NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to 
WORK.PARMS.PARMS.SLIST.
Unable to sample external file, no data in first 5 records.
ERROR: Import unsuccessful.  See SAS Log for details.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.09 seconds
      cpu time            0.09 seconds

33         
34         %_eg_hidenotesandsource;
46         
47         
48         %_eg_hidenotesandsource;
51

My ultimate goal is to mine Employee first name (Gerald), last name (Harris) and Employee Number (108181) from the above file and store it in the dataset (and then do this over and over again with a loop and upend the same dataset). If you can help regarding importing the entire file or just the information that I need directly, then that would help.

2
That is an XML file. Did you try the XML libname engine?Tom
Yes I did. That thread lead me to install XML mapper in SAS which I am not able/authorized to do in my workplace. Then my thought process was to simply see if text mining is a way go. I am still thinking that I will be able to use substr and index function to get what I want once the file is read.Abhee Brahmnalkar
Try using the AUTOMAP option and then manually editing the map file. documentation.sas.com/…Tom

2 Answers

1
votes

If you only need these three fields then named input a single input statement is perfectly viable, and arguably preferable to parsing xml with regex:

data want;
infile xmlfile dsd dlm = ' /';
input @"Employee" @"firstName=" firstName :$32. @"lastName=" lastName :$32. @"emplnbr=" emplnbr :8.;
run;

This uses the input file constructed in Richard's answer. The initial @Employee is optional but reduces the risk of picking up any fields with the same names as the desired ones that are subfields of a different top-level field.

Bonus: the same approach can also be used to import json files if you're in a similar situation.

0
votes

Since you are unable to use the preferred methods of reading xml data, and you are processing a single record result from a service query the git'er done approach seems warranted.

One idea that did not pan out was to use named input.

input @'Employee' lastname= firstname= emplnbr=;

The results could not be made to strip the quotes with $QUOTE. informat nor honor infile dlm=' /'

An approach that did work was to read the single line and parse the value out using a regular expression with capture groups. PRXPARSE is used to compile a pattern, PRXMATCH to test for a match and PRXPOSN to retrieve the capture group.

* create a file to read from (represents the file from the service call capture);
options ls=max;
filename xmlfile "%sysfunc(pathname(WORK))\1-service-call-record.xml";

data have;
  input;
  file xmlfile;
  put _infile_;
  datalines;
<OwnerInquiryResponse xmlns="http://www.fedex.com/esotservice/schema"><ResponseHeader><TimeStamp time="2018-02-01T16:09:19.319Z"/></ResponseHeader><Owner><Employee firstName="Gerald" lastName="Harris" emplnbr="108181"/><SalesAttribute type="Sales"/><Territory NodeGlobalRegion="US" SegDesc="Worldwide Sales" SegNbr="1" TTY="2-2-1-2-1-1-10"/></Owner><Delegates/><AlignmentDetail><SalesAttribute type="Sales"/><Alignments/></AlignmentDetail></OwnerInquiryResponse>
run;

* read the entire line from the file and parse out the values using Perl regular expression;

data want;
  infile xmlfile;
  input;
  rx_employee = prxparse('/employee\s+firstname="([^"]+)"\s+lastname="([^"]+)"\s+emplnbr="([^"]+)"/i');
  if prxmatch(rx_employee,_infile_) then do;
    firstname = prxposn(rx_employee, 1, _infile_);
    lastname  = prxposn(rx_employee, 2, _infile_);
    emplnbr   = prxposn(rx_employee, 3, _infile_);
  end;
  keep firstname last emplnbr;
run;