0
votes

I have a following datasets in CSV Format and want to generate a merged data set. the dataset 1 be like:

Name Age,
Casper 12,
Jasper 13,
Master 14,
Clapper 15

and the dataset 2 be like:

Name Age1,
Casper 13,
Jasper 14,
Master 15,
Clapper 16

I want to merge these 2 data sets and create a 3rd dataset like this:

 Name   Age Age1
Casper   12  13
Jasper   13  14
Master   14  15
Clapper  15  16

PS - this is not any assignment though. was doing a regular merge on SAS and strike with a random query. Tried hard to find something relevant on web.

NOTE: The dataset 1 and 2 have only single cells. That is one cell consist of 2 entries that is Name and Age are in Single cell.

2

2 Answers

1
votes

Read in the files in the same way - take the line input, separate the name and the age using scan and compress out the comma if necessary. Input the name to a numeric.

data DS1;
infile cards firstobs=2 truncover;
length NAME $10. AGE 3.;
input;
NAME=scan(_infile_,1,,'s');
AGE=input(compress(scan(_infile_,2,,'s'),","),8.);
cards;
Name Age,
Casper 12,
Jasper 13,
Master 14,
Clapper 15
;
run;

data DS2;
infile cards firstobs=2 truncover;
length NAME $10. AGE 3.;
input;
NAME=scan(_infile_,1,,'s');
AGE=input(compress(scan(_infile_,2,,'s'),","),8.);
cards;
Name Age,
Casper 13,
Jasper 14,
Master 15,
Clapper 16
;
run;

Then from here you could either sort the steps and do a dataset merge or do a SQL join:

proc sort data=DS1; by NAME; run;
proc sort data=DS2; by NAME; run;

data MERGED;
merge DS1 DS2(rename=(AGE=AGE1));
by NAME;
run;

OR SQL:

    proc sql;
    create table MERGED2 as
    select coalesce(t1.NAME,t2.NAME) as NAME
                , t1.AGE as AGE
                , t2.AGE as AGE1
    from
        DS1 t1
    full join
        DS2 t2
        on t1.NAME=t2.NAME;
    quit;
0
votes

Hi after you read your data into sas try this:

proc sql;
 create table dataset3 as 
 select a.*, b.age1
 from dataset1 as a, dataset2 as b
 where a.name=b.name;
quit;

I like to merge my datasets with proc sql. The create table is making a new dataset, in the select statement a.* is selecting all the variables from dataset1 (which I named 'a' as a reference) and age1 from dataset2 (as 'b'). I am merging them on name. See if that works. If not let me know!

-Mike