Creating concatenative summary data in SAS (PROC SQL)

Question

I have a SAS dataset (change_detail) that maintains change variables and looks like this (except there are over 200 change variables in the actual table):

Reference Chg1Code Chg1Desc Chg2Code Chg2Desc Chg3Code Chg3Desc
--------- -------- -------- -------- -------- -------- --------
1234                        C        Change2
1234      C        Change1  C        Change2
1234      C        Change1                    C        Change3
2345      C        Change1                    C        Change3
2345                                          C        Change3
3456                        C        Change2  C        Change3
3456      C        Change1  C        Change2
3456      C        Change1
3456      C        Change1                    C        Change3

I need to distill this information into a summary of the changes that have occurred to each reference number. When I used the following PROC SQL in SAS:

proc sql;
create table work.changedata_summary as 
select distinct 
      a.Reference, 
      catx(',',a.Chg1Desc,a.Chg2Desc,a.Chg3Desc) as Changes length=1000
from
      work.change_detail a
;
quit;

It concatenates the individual Change Description variables, but maintains the original number of observations per Reference Number. The results I get are:

Reference Changes
--------- -------
1234      Change2
1234      Change1,Change2
1234      Change1,Change3
2345      Change1,Change3
2345      Change3
3456      Change2,Change3
3456      Change1,Change2
3456      Change1
3456      Change1,Change3

I need some help in flattening this data out (using either PROC SQL or a DATA step) to show just the summary results by distinct Reference number. The expected result is:

Reference Changes
--------- -----------------------
1234      Change1,Change2,Change3
2345      Change1,Change3
3456      Change1,Change2,Change3

Any help provided will be appreciated! Thanks!

Can you have only 3 changes or is this just a small sample of your data? — Reeza
This is just a small sample. The actual dataset contains change data on over 200 different variables, so I need a scalable solution. — Brian S

Kiran Kiran · Accepted Answer · 2018-08-20T16:11:59

if your data is something like this then adding a group by and max to your code should work

  proc sql;
  create table work.changedata_summary as 
  select  
        Reference, 
  catx(',',max(chg1Desc),max(Chg2Desc), max(Chg3Desc)) as Changes length=1000
  from
  have 
  group by Reference
  ;
  quit;

Creating concatenative summary data in SAS (PROC SQL)

3 Answers