3
votes

I Have a column with many flags that were parsed from a XML parser. Data looks like this:

USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;

I have to create a table with all these column names to capture the flags. Like:

USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED 
Y  N N N N Y N Y N 
Y  N N N N Y Y Y N 
Y  N N Y N Y N Y N 

How can I capture values dynamically in SAS? Either in a DATA step or a PROC step?

Thanks in advance.

4
Read in the data with the first variable as Variable and the second Variable as Value. Then transpose the data.Reeza
Is the string in a character variable in a dataset or in a text file that you want to read? If it is in a text file then perhaps you can read using named input style. But you would need to know the variable names to write the input statement.Tom
Thank you for the responses. For now I tried with the string functions and it is working fine. Eventhough it is working, the code is not looking satisfactory for me as in future if more flags are added to the column, it requires a code change.Naga Vemprala
Perhaps you should post that code, so we can work off that in giving you suggestions.Joe
my apologies. I just posted. In comments, it was not coming good. so answered my own question. Still I request you to provide me a better solution, please.Naga Vemprala

4 Answers

2
votes

Let's start with your example output data.

data expect ;
  id+1;
  length USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION 
         USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED $1 ;
  input USERKEYED -- UMAPPED;
cards4;
Y  N N N N Y N Y N 
Y  N N N N Y Y Y N 
Y  N N Y N Y N Y N 
;;;;

Now we can recreate your example input data:

data have ;
  do until (last.id);
    set expect ;
    by id ;
    array flag _character_;
    length string $200 ;
    do _n_=1 to dim(flag);
      string=catx(';',string,catx('=',vname(flag(_n_)),flag(_n_)));
    end;
  end;
  keep id string;
run;

Which will look like this:

USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N 
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N 
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N 

So to process this we need to parse out the pairs from the variable STRING into multiple observations with the individual pairs' values split into NAME and VALUE variables.

data middle ;
  set have ;
  do _n_=1 by 1 while(_n_=1 or scan(string,_n_,';')^=' ');
    length name $32 ;
    name = scan(scan(string,_n_,';'),1,'=');
    value = scan(scan(string,_n_,';'),2,'=');
    output;
  end;
  keep id name value ;
run;

Then we can use PROC TRANSPOSE to convert those observations into variables.

proc transpose data=middle out=want (drop=_name_) ;
  by id;
  id name ;
  var value ;
run;
1
votes

The data that you have is a series of name/value pairs, using a ; as a delimiter. We can extract each name/value pair one at a time, and then parse those into values:

data tmp;
  length my_string next_pair name value $200;
  my_string = "USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;";
  cnt = 1;
  next_pair = scan(my_string,cnt,";");
  do while (next_pair ne "");
    name = scan(next_pair,1,"=");
    value = scan(next_pair,2,"=");
    output;
    cnt = cnt + 1;
    next_pair = scan(my_string,cnt,";");
  end;
  keep name value;
run;

Gives us:

name                value
=================== =====
USERKEYED           Y
VALMATCH            N
DEVICEVERIFIED      N
EXCEPTION           N
USERREGISTRD        N
ASSOCIATE           Y
EXTERNAL            N
GROSSGIVEN          Y
UMAPPED             N

We can then transpose the data so that the name is used for the column names:

proc transpose data=tmp out=want(drop=_name_);
  id name;
  var value;
run;

Which gives you the desired table.

0
votes
DATA <MY_DATASET>;
SET INPUT_DATASET;
USERKEYED = substr(input_column, find(input_column, 'USERKEYED=')+10,1);
VALMATCH = substr(input_column, find(input_column, 'VALMATCH=')+9,1);
DEVICEVERIFIED = substr(input_column, find(input_column, 'DEVICEVERIFIED=')+15,1);
EXCEPTION = substr(input_column, find(input_column, 'EXCEPTION=')+10,1);
USERREGISTRD = substr(input_column, find(input_column, 'USERREGISTRD=')+13,1);
ASSOCIATE = substr(input_column, find(input_column, 'ASSOCIATE=')+10,1); EXTERNAL = substr(input_column, find(input_column, 'EXTERNAL=')+9,1);
GROSSGIVEN = substr(input_column, find(input_column, 'GROSSGIVEN=')+11,1);
UMAPPED = substr(input_column, find(input_column, UMAPPED=')+8,1);
run; 
0
votes

My answer is essentially in the first block of code, the rest is just explanation, one alternative and a nice tip.

Based on the answer you gave, the input data is already in a SAS data set, so that can be read to create a file of SAS code which can then be run using %include and so proc transpose is not required:

filename tempcode '<path><file-name.txt>'; /* set this up yourself */

/* write out SAS code to the fileref tempcode */
data _null_;
  file tempcode;
  set have;
  if _n_=1 then
    put 'Y="Y"; N="N"; drop Y N;';
  put input_column;
  put 'output;';
run;

/* %include the code to create the desired output */
data want;
  %include tempcode;
run;

As the input data already almost looks like SAS assignment statements, we have taken advantage of that and so the SAS code that has been run from fileref tempcode using %include should look like:

Y="Y"; N="N"; drop Y N;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;

As an alternative, fileref tempcode could contain all of the code for data step "data want;":

/* write out entire SAS data step code to the fileref tempcode */
data _null_;
  file tempcode;
  set have end=lastrec;
  if _n_=1 then
    put 'data want;'
       /'Y="Y"; N="N"; drop Y N;';

  put input_column;
  put 'output;';

  if lastrec then
    put 'run;';
run;

%include tempcode; /* no need for surrounding SAS code */

As a tip, to see the code being processed by %include in the log you can use the following variation:

%include tempcode / source2;