4
votes

Suppose I have two datasets,

--Table 1--      --Table 2--
ID  Amount       ID  Amount1 Code
A01   0.1        A01  0.3     x
A02   0.2        A02  0.2     y
A02   0.3        A03  0.4     g 
A03   0.4        A03  0.5     u
A05   0.6        B01  0.1     k

I am trying to create a new dataset (Table 3) by joining Table 1 and Table 2. The intended final result should look like this:

--Table 3--
ID  Amount  Amount1  Code
A01   0.1     .       .
A01   .       0.3     x
A02   0.2     0.2     y
A02   0.3     .       .
A03   0.4     0.4     g
A05   0.6     .       .
B01   .       0.1     k

where the table will be joined based on the ID with the amount and amount1 compared at the same time. I tried using PROC SQL FULL JOIN but the results appear to be a little weird. Thank you.

2
Are Table1 and Table2 big?Kostya
Yea, the Table 1 and Table 2 are each around 5gb in size.saspower

2 Answers

7
votes

The only thing that may not be obvious for a novice user is necessity to do coalesce() on IDs.

proc sql;
create table joined as
select coalesce(a.ID, b.ID) as ID, a.Amount, b.Amount1, b.Code
from Table1 a
full join Table2 b
on a.ID = b.ID;
quit;

Anyway, SAS way to do this is merge of the two tables. If you have the tables pre-sorted or have index on IDs, this would be also more efficient:

data merged;
merge table1 table2;
by ID;
run;
0
votes

The following code produces a result quite close to the result that you were asking, although I could not figure out why the "u" was set to missing in the your requested dataset.

    proc sql;
      create table joined as
      select coalesce(a.ID, b.ID) as ID, a.Amount, b.Amount, b.Code
      from Table1 a natural full join Table2(rename=(amount1=amount)) b
       ;
    quit;