3
votes

I have a dataset that looks like this, and am using SAS Enterprise Guide 6.3:

data have;
    input id state $;
    cards;
134 NC,NY,SC
145 AL,NC,NY,SC
;
run;

I have another dataset that has several metrics for each id in every state, but I only need to pull the data for the states listed in the second column of the have dataset.

data complete;
    input id state $ metric;
cards;
134 AL 5
134 NC 4.3
134 NY 4
134 SC 5.5
145 AL 1.3
145 NC 1.3
145 NY 1.5
145 SC 1.1
177 AL 10
177 NC 74
177 NY 23
177 SC 33
;
run;

I thought about using trnwrd to replace the comma with ', ' and concatenating a beginning and ending quote to make the list a character list, so that I could use a WHERE IN statement. However, I think it would be more helpful if I could somehow transpose the comma separated list to something like this:

data have_mod;
        input id state $;
        cards;
    134 NC
    134 NY
    134 SC
    145 AL
    145 NC
    145 NY
    145 SC
    ;
    run;

Then I could simply join this table to the complete data table to get the subset I need (below).

data want;
    input id state $ metric;
cards;
134 NC 4.3
134 NY 4
134 SC 5.5
145 AL 1.3
145 NC 1.3
145 NY 1.5
145 SC 1.1
;
run;  

Any thoughts? Thanks.

1
6.3 means 6.1 and 9.3 I assume? Or something else? (Versions i'm familiar with are 4.1, 4.3, 5.1, 6.1, 7.1, unless there are smaller versions that I've never heard of?)Joe
Yes 6.1 sorry. Thanks.pyll

1 Answers

3
votes

I'd do exactly what you propose and transpose it - except i'd read it in that way.

data have;
    infile datalines truncover dlm=', ';
    length state $2;
    input id @;   *read in the id for that line;
        do until (state='');   *keep reading in until state is missing = EOL;
            input state $ @;
            if not missing(state) then output;
        end;
    cards;
134 NC,NY,SC
145 AL,NC,NY,SC
;
run;

Alternately, you can SCAN for the first statecode.

data want_to_merge;
  set have;
  state_first = scan(state,1,',');  
run;

SCAN is the function equivalent of reading in a delimited file.