0
votes

I am trying to merge two data sets with a lot of the same observations except one has a column more. Dataset1 contains data for 20% of the observations and Dataset2 contains observations for the other 80% (+ one column extra). If i run the code below, I get 4037 observations when i merge the two datasets with 315 observation!

proc sql;
create table RateExposure as
select *
from Dataset1, Dataset2
where Dataset1.ID = Dataset2.ID
order by Dataset1.ID;
quit;

If I run

data newDataset;
merge Dataset1 Dataset2;
by ID;
run;

I only get observation for one of the datasets - how can this be?

1
Maybe for the first code there will be better to use SELECT t1.*, t2.extra_column FROM Dataset1 AS t1, Dataset2 AS t2? and see the resultandrey_sz
It does not seem to work. I tried different variations but nothing gave the correct outcome.Mathias
Did you try to join the tables?andrey_sz
I am not sure which join I should perform. I tried union join but I wasn't rightMathias

1 Answers

0
votes

Try this:

PROC SQL;
CREATE TABLE result AS
SELECT t1.*, t2.extra_column
FROM Dataset1 AS t1
INNER JOIN Dataset2 AS t2 ON (t1.ID = t2.ID)
;
QUIT;