SAS merge datasets to overwrite values

Question

I want to use dataset B to overwrite some values in dataset A by merging A and B on a common ID. However, it doesn't work as expected. Here is the test I did:

/* create table A */
data a;
  infile datalines;
  input id1 $ id2 $ var1;
datalines;
    1 a 10
    1 b 10
    2 a 10
    2 b 10
;
run;

/* create table B */
data b;
  infile datalines;
  input id1 $ var1 var2;
datalines;
    1 20 30
    2 20 30
;
run; 

/* merge A&B to overwrite var1 in table A using values in table B */
data c;
    merge a b;
    by id1;
run;

Table C looks like this:

ID1  ID2  VAR1  VAR2
1    a    20    30
1    b    10    30
2    a    20    30
2    b    10    30

Why didn't the 10s in rows 2 and 4 get replaced by 20 from table B, while var2 works as expected?

I know I can do this simply using proc SQL, and that's what I did to solve the problem. But I'm still curious whether there is a way to do what I wanted using merge, and why this wasn't working. I prefer merge over SQL in this circumstance because the logic is easier to implement (until I found it wasn't working properly).

I use SAS 9.4.

You have duplicate values of id1 in table A. I think there's no simple approach using any merge/update/modify to achieve this. — Lovnlust
Don't use code snippets with anything other than the languages they explicitly work with (mostly, html/javascript/etc.). — Joe
Also relevant (but not an exact dup), stackoverflow.com/questions/25251177/… — Joe

DomPazz · Accepted Answer · 2015-03-04T03:21:57

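This has to do with how SAS iterates over the data sets during the merge. Within a BY group, each iteration reads the next record from every data set that still has one, and values already read are kept until the BY group changes. On the first record of each ID1, VAR1 is read from A and then overwritten by the value from B. On the second record, B has no more records for that ID1, so nothing from B lines up with it: VAR1 gets its value from A again (10), while VAR2, which exists only in B, is carried over from the previous record.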
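The carry-over can be made visible with a small experiment on the question's tables. This is only a sketch: the names `demo` and `var1_b` are made up for illustration. Renaming B's VAR1 keeps it from colliding with A's VAR1, so it behaves like VAR2 and is held for the whole BY group.

```sas
/* Illustration only: demo and var1_b are hypothetical names. */
/* Tables a and b are the ones created in the question.       */
data demo;
    /* var1_b exists only in B, so nothing from A overwrites it */
    merge a b(rename=(var1=var1_b));
    by id1;
run;

proc print data=demo noobs;
run;
```

With the question's data, `var1_b` should be 20 and `var2` 30 on all four rows, while `var1` stays at 10 because it now comes only from A. That is the same retention that made VAR2 "work" in the original merge.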

If there is a record in B for every ID1, then you can rewrite your merge like this to achieve what you want.

/* merge A&B to overwrite var1 in table A using values in table B */
data c;
    merge a(drop=var1) b;
    by id1;
run;

This drops VAR1 from A, so the only VAR1 comes from B and its value is carried down to every record in the BY group.

Otherwise you will need more complex logic (might I suggest an SQL left join with the coalesce() function?).
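A minimal sketch of that SQL suggestion, using the tables from the question. The output name `c` and the ORDER BY clause are choices made here, not part of the original answer. It assumes B has at most one row per ID1 and that a missing value in B should fall back to A's value.

```sas
/* Sketch: keep every row of A, prefer B's var1 when one exists. */
proc sql;
    create table c as
    select a.id1,
           a.id2,
           /* first non-missing value: B's var1, else A's var1 */
           coalesce(b.var1, a.var1) as var1,
           b.var2
    from a
    left join b
        on a.id1 = b.id1
    /* SQL does not guarantee row order without ORDER BY */
    order by a.id1, a.id2;
quit;
```

Unlike the `drop=var1` merge, this keeps A's VAR1 for any ID1 that has no match in B, and `var2` is simply missing on those rows.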
