1
votes

I have a column in my SAS dataset that has a 20 digit ID. When I attempt to filter out some records with PROC SQL in SAS EG, the field with the ID truncates to 1E18 and the export gives me the truncated value in each field instead of the actual value.

I attempted to create a data setp and define the field value but this doesn't have any impact.

Here's my sample code

proc sql; create table a01 as select a.id, a.value from b; quit;

data a02;
set a01;
id = put(id,$30.);
run;

Still doesnt seem to work.

Does anyone have an idea what is it that I am doing incorrectly.

2

2 Answers

3
votes

[Assuming you are using SAS on windows - the story would'nt be too different on *nix OS] The biggest integer that can be represented accurately under SAS is 9,007,199,254,740,992 or 2^53 or about 16 digits long - in most cases credit card numbers which are generally 15-16 digits long can be stored as numeric variables in SAS. But, as you mention you have a 20-digit integer it results in the value losing some precision and getting converted into a scientific notation.

The problem is not PROC SQL - the problem is SAS's smaller limit on representing big integers. Most other databases will represent upto 2^64 accurately.See SAS documentation

1
votes

While Sashikanth is correct that SAS (etc.) cannot store a numeric above ~15 digits safely (most 16 digit numbers, but not all), that's not the entire issue here. (SAS has no true concept of 'integer', all numbers are floating point numbers, hence only storing up to 2**53 at most.)

data a02;
set a01;
id = put(id,$30.);
run;

This incorrectly uses the $30. format, which is a character format. In put, the format type is driven by the first argument to put, not the resulting type (which is always character - put generates characters, input generates numbers). So $30. would only be appropriate if id were a character variable. Since it seems like it is a numeric variable, it needs to be:

id2 = put(id,30.);

Note it is id2 on the left hand side - you could not convert a numeric id to character id in the same data step, they must have different names.

It's also possible that your number was converted to 1E18 beforehand (such as by Excel, if it passed through there - Excel couldn't store that number precisely either); the data step you posted should have thrown several warnings about character to/from numeric conversion if id was a numeric variable.