SAS/SQL: Combine two columns while retaining others

Question

I need to merge two data sets. Each data set contains a sequential observation number. The first data set contains only the first observation. The second data set contains all subsequent observations. Not all subjects have the same number of observations.

The problem is as follows. There are two different types of subject. The type is contained only in the first data set. When I merge the two data sets together, the type is missing on all observations but the first for each subject. Please see my example below.

I would like to know how to do this with both SQL and a DATA step. My real data sets are not large, so efficiency of processing is not major a concern.

I have tried using RETAIN, but as the second data set doesn't contain the TYPE variable, there is no value to retain. Regarding SQL, it seems like UNION should work, and there are countless examples of UNION on the internet, but they all involve a single variable. I need to know how to union the Observation variable by ID while retaining the Amount and assigning the Type.

Example

data set1;
  input ID $ 
        Observation 
        Type $ 
        Amount 
        ;

  datalines;
  002 1 A 15
  026 1 A 30
  031 1 B 7
  028 1 B 10
  036 1 A 22
  ;
run;

data set2;
  input ID $ 
        Observation 
        Amount 
        ;

  datalines;
  002 2 11
  002 3 35
  002 4 13
  002 5 12
  026 2 21
  026 3 12
  026 4 40
  031 2 11
  028 2 27
  036 2 10
  036 3 15
  036 4 16
  036 5 12
  036 6 20
  ;
run;

proc sort data = set1;
  by  ID
      Observation
      ;
run;

proc sort data = set2;
  by  ID
      Observation
      ; 
run;

data merged;
  merge set1
        set2
        ;
  by  ID
      Observation
      ;
run;

This gives

                      ID     Observation    Type    Amount

                      002         1          A        15
                      002         2                   11
                      002         3                   35
                      002         4                   13
                      002         5                   12
                      026         1          A        30
                      026         2                   21
                      026         3                   12
                      026         4                   40
                      028         1          B        10
                      028         2                   27
                      031         1          B         7
                      031         2                   11
                      036         1          A        22
                      036         2                   10
                      036         3                   15
                      036         4                   16
                      036         5                   12
                      036         6                   20

However, what I need is

                      ID     Observation    Type    Amount

                      002         1          A        15
                      002         2          A        11
                      002         3          A        35
                      002         4          A        13
                      002         5          A        12
                      026         1          A        30
                      026         2          A        21
                      026         3          A        12
                      026         4          A        40
                      028         1          B        10
                      028         2          B        27
                      031         1          B         7
                      031         2          B        11
                      036         1          A        22
                      036         2          A        10
                      036         3          A        15
                      036         4          A        16
                      036         5          A        12
                      036         6          A        20

JJFord3 JJFord3 · Accepted Answer · 2016-11-23T17:45:18

I'm sure there are other ways to do it, but this is how I'd do it.

First, stack the data keeping only the common fields.

data new;
set set1 (drop = TYPE) set2;
run;

Then merge the type field back over.

proc sql;
create table new2 as select
a.*,
b.TYPE
from new a
left join set1 b
on a.id=b.id;
quit;

SAS/SQL: Combine two columns while retaining others

3 Answers