I am performing a basic proc compare:

PROC COMPARE BASE=dset1
           COMPARE=dset2 LISTALL;
  ID description;
RUN;

I am getting differences for some numeric variables when the values in the two datasets are exactly the same.

For example,

         dset1.variable1 = 1.0988718715
         dset2.variable1 = 1.0988718715

The proc compare has the following displayed for variable1:

Base       Compare    Diff.
1.0989     1.0989     -1.07E-13

I removed all formats and informats from the base and compare datasets and the length of variable1 is the same in both datasets.

Why is there a difference when the value is exactly the same?

1 Answer

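Floating-point precision on a computer is finite, so two values that print identically can still differ in their least significant bits. That is why PROC COMPARE can report tiny differences on "identical" numbers: the report shows rounded values (1.0989 here), while the comparison uses the full stored values, which in your case differ by about 1E-13.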
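A minimal sketch of how this can happen, using made-up values rather than your data: 0.1 + 0.2 and 0.3 look the same when printed with a short format, but on typical IEEE hardware they are not stored as the same binary value. The HEX16. format prints the internal representation, which is one way to check whether two "identical" values in your datasets really match.

```sas
data _null_;
  x = 0.1 + 0.2;              /* computed value */
  y = 0.3;                    /* literal value */
  diff = x - y;
  put x= 8.4 y= 8.4;          /* rounded for display, like the PROC COMPARE report */
  put x= hex16. y= hex16.;    /* internal floating-point representation */
  put diff= e12.;             /* the tiny difference, in scientific notation */
  if x = y then put 'x and y are exactly equal';
  else put 'x and y differ in the stored bits';
run;
```

If the extra digits are only noise, rounding both variables to a fixed precision before comparing, for example with ROUND(variable1, 1E-10), is another possibility.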

There are a couple of options you can use to ignore these small differences (taken from the documentation):

METHOD=ABSOLUTE | EXACT | PERCENT | RELATIVE<(δ)> specifies the method for judging the equality of numeric values. The constant δ (delta) is a number between 0 and 1 that specifies a value to add to the denominator when calculating the equality measure. By default, δ is 0.

Unless you use the CRITERION= option, the default method is EXACT. If you use the CRITERION= option, then the default method is RELATIVE(φ), where φ (phi) is a small number that depends on the numerical precision of the computer on which SAS is running and on the value of CRITERION=.
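Applied to the step in the question, this might look like the following. The threshold 1E-10 is an assumed value, picked only because it is well above the 1E-13 difference shown; choose one that suits your data.

```sas
/* Treat values as equal when their absolute difference is at most 1E-10 */
/* (1E-10 is an illustrative threshold, not a recommended default)       */
PROC COMPARE BASE=dset1
           COMPARE=dset2 LISTALL
           METHOD=ABSOLUTE CRITERION=1E-10;
  ID description;
RUN;
```

METHOD=ABSOLUTE is used here because the difference is a fixed tiny amount; METHOD=RELATIVE may be a better fit when the variables span very different magnitudes.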

FUZZ=number alters the values comparison results for numbers less than number. PROC COMPARE prints:

0 for any variable value that is less than number.

a blank for difference or percent difference if it is less than number

0 for any summary statistic that is less than number.

Default: 0

Range: 0 - 1

Tip: A report that contains many trivial differences is easier to read in this form.

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146741.htm
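A sketch of FUZZ= on the same step, again with 1E-10 as an assumed threshold. As I read the documentation quoted above, FUZZ= changes what is printed rather than how equality is judged, so you may still want CRITERION= if the goal is for the values to count as equal.

```sas
/* Print differences smaller than 1E-10 as blanks to declutter the report */
/* (1E-10 is an illustrative threshold; FUZZ= accepts values from 0 to 1) */
PROC COMPARE BASE=dset1
           COMPARE=dset2 LISTALL
           FUZZ=1E-10;
  ID description;
RUN;
```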