0
votes

Scenario: I need to merge two SAS datasets together into one and maintain any non-matching values.

DatasetA contains: CustomerID, HouseholdID, EmailAddress, etc. DatasetB contains: HouseholdID

DatasetA:
CustomerID     HouseholdID   EmailAddress  ...
1001            100001         [email protected]
1002            100002         [email protected] 
2003            100003         [email protected]

DatasetB:
HouseholdID
100003 
100001          

I want to merge these two together, but obviously if I do that as is, I'll lose track of DatasetB.

To expand, DatasetA contains ALL customer information, and DatasetB only contains certain HouseholdID's (not all of them). I want to create a table that looks like this by merging. So, I think I need to duplicate HouseholdID in DatasetB, but how can I do this?

Desired Output (DatasetC)
CustomerID    HouseholdID   EmailAddress ... DatasetBHouseholdID
1001          100001        [email protected]       100001
1002          100002        [email protected]    .
2003          100003        [email protected] 100003
...

I tried to do this:

data data.datasetc;
merge data.dataseta data.datasetb (RENAME=(householdID=datasetbhouseholdID));
by householdid;
run;

But that resulted in an error of course (BY variable householdID is not on input dataset data.datasetb).

I think the best course of action is to make DatasetB look like this:

DatasetB:
HouseholdID     DatasetBHouseholdID
100003            100003
100001            100001

But how can I do that?

1

1 Answers

0
votes

Use the IN= dataset option.

data data.datasetc;
  merge data.dataseta data.datasetb (in=inB);
  by householdid;
  if inB then datasetbhouseholdID=householdID;
run;