3
votes

I am a tester and I need to compare two data sets structure (not table data) in SAS. I tried to use 'proc compare' but it compares the data. I want to compare dataset/table structure (column name, data type, null constraints etc.)

Can any one please help?

5

5 Answers

6
votes

You can interrogate the views in SASHELP (vtable, vcolumn etc) to do this. A quick way would be to create a temporary table from sashelp.vcolumn for each of the two tables you want to compare, then use a PROC SQL join to compare them. Then you'll be comparing the structures, which is represented in the data from vcolumn.

To get started with this, have a look at what's in SASHELP.vcolumn.

Here is a basic example of employing this method, to compare variables in 2 datasets.

* provide names of the two data sets here ;
%let ds1=TheFirstDataSet;
%let ds2=TheOtherDataSet;

* upcase the data set names ;
%let ds1=%sysfunc(upcase(&ds1));
%let ds2=%sysfunc(upcase(&ds2));

proc sql;
* retrieve info on these tables from sashelp.vcolumn;
  create table first as select * from sashelp.vcolumn where upcase(memname)="&ds1";
  create table second as select * from sashelp.vcolumn where upcase(memname)="&ds2";
* join these data sets and report on differences for var names;
  select coalescec(f.name,s.name) as varName
        ,case
          when f.name is null then "This var is in &ds2 only"
          when s.name is null then "This var is in &ds1 only"
          else 'This var is in both data sets'
          end as DiffDescription
  from 
    first as f
    full outer join 
      second as s 
      on f.name=s.name
  ;
quit;

You can generalise from this for other attributes such as data type, length, label etc., all of which are available in vcolumn.

  • Note that you may need to alter this code to accommodate librefs that your data sets may have.
3
votes

You can write the descriptor portions to data sets using proc contents then use proc compare to see how their structures differ. The out2 option will write out integrity constraints if they exist. If not the data set will be empty. Some columns like CRDATE (creation date) or LIBNAME or MEMNAME, may be expected to differ and so you may wish to exclude those from the comparison.

/* create some fake data similar to an existing one */
proc sql;
create table myclass as 
  select *, "foo" as newcol from sashelp.class
;
/* modify it */
  insert into myclass
     values ("George", "M", 17, 72, 169,"foo");
/* add an index */
  create index names on
     work.myclass(name, age);
quit;

/* write out descriptor portions to data sets */
proc contents data=myclass out=ds1 out2=ds2;run;
/* sashelp.class doesn't have an index so ds2a will not exist */
proc contents data=sashelp.class out=ds1a out2=ds2a;run;

/* compare data set structures */
proc compare data=ds1 compare=ds1a;run;
1
votes

1- Use PROC CONTENTS to get your dataset description (dataset name, variable name, variable label, variable type....)

2- PROC SORT all of the content output

3- Use a PROC COMPARE.

i.e

***********************************************;
proc content data=table1 out=cont1 noprint;
run;

proc content data=table2 out=cont2 noprint;
run;

proc sort data=cont1; 
  by memname name;
run;

proc sort data=cont2; 
  by memname name;
run;

proc compare listvar
  base=cont1
  compare=cont2;
  id memname name;
run;
******************* END ************;
0
votes

You can also compare SAS tables structures side-by-side visually using checklist tables.

In my recent blog post How to compare SAS data tables for common/uncommon columns, I show how to do this using the following code example:

data WORK.NEWCARS (drop=temp:);
   set SASHELP.CARS (rename=(Origin=Region EngineSize=temp1 Make=temp2));
   length EngineSize $3 Make $20;
   EngineSize = put(temp1,3.1);
   Make = temp2; 
   label Type='New Car Type';
run;

Then we can compare 2 datasets as follows:

proc contents data=SASHELP.CARS noprint out=DS1(keep=Name Type Length Label);
run;

proc contents data=WORK.NEWCARS noprint out=DS2(keep=Name Type Length Label);
run;

data comparison_matrix;
   merge
      DS1(in=in1 rename=(Type=Typ1 Length=Len1 Label=Lab1))
      DS2(in=in2 rename=(Type=Typ2 Length=Len2 Label=Lab2));
   by Name;

   /* set symbol shape: 1=V; 0=X */
   ds1 = 1; ds2 = 1;
   if in1 and not in2 then ds2 = 0; else
   if in2 and not in1 then ds1 = 0;

   /* add background color */
   if ds1=ds2=1 then
   select;
      when(Typ1^=Typ2) do; ds1=2; ds2=2; end;
      when(Len1^=Len2) do; ds1=3; ds2=3; end;
      when(Lab1^=Lab2) do; ds1=4; ds2=4; end;
      otherwise; 
   end;

   label
      Name = 'Column Name'
      ds1 = 'SASHELP.CARS'
      ds2 = 'WORK.NEWCARS'
      ;
run;

proc format;
   value chmark
      0   = '(*ESC*){unicode "2718"x}'
      1-4 = '(*ESC*){unicode "2714"x}'
      ;
   value chcolor
      0   = red
      1-4 = green
      ;
   value bgcolor
      2 = 'cxffccbb'
      3 = 'cxffe177'
      4 = 'cxd4f8d4' 
      ;
run;

ods html path='c:\temp' file='comp_marix.html' style=Seaside;
ods escapechar='^';
title 'Data set columns comparison matrix';

proc odstext;
   p '<div align="center">Mismatch Legend:'||
     '<span style="background-color:#ffccbb;margin-left:17px">^_^_^_^_</span> Type'||
     '<span style="background-color:#ffe177;margin-left:17px">^_^_^_^_</span> Length'||
     '<span style="background-color:#d4f8d4;margin-left:17px">^_^_^_^_</span> Label</div>'
   / style=[fontsize=9pt];
run;

title; 
proc print data=comparison_matrix label noobs;
   var Name / style={fontweight=bold width=100px};
   var ds1 ds2 / style={color=chcolor. backgroundcolor=bgcolor. just=center fontweight=bold width=120px};
   format ds1 ds2 chmark.;
run;

ods html close;

The result of running this code will be the following checklist table: enter image description here

For additional details see the blog post How to compare SAS data tables for common/uncommon columns

0
votes

You can use PROC COMPARE, just use the OBS=0 dataset option on each input dataset so there is no data to compare.

proc compare data=old(obs=0) compare=new(obs=0);
run;