2
votes

Hi I have a SAS character column called Hexchar contains hexadecimal character "0x1ba0c56c966ee8b0000" This is actually 8154350000000000000000 in decimal number. And I want to convert "0x1ba0c56c966ee8b0000" to 8154350000000000000000. If the hexadecimal is less than 16 bytes then the following function does the job.

DexNumber=input(substr(Hexchar,3), hex16.); /* The reason why substr() function is to skip "0x" in the front */

However hex format is limited only to 16 bytes and not handle more bytes. Do you know any good logic to convert this?

2
Try first converting the hex digits into the actual bytes they represent. binary=input(string,$hex.); and then convert the binary bytes into a number. I tried using number=input(binary,pib8.) but it did not give the value you say you want. Explain how to convert from the string you have to the number you want.Tom
SAS uses 8 byte IEEE floating point for numbers. You cannot store more than about 15 decimal digits of precision. What do you want to do with these values with more than 16 decimal digits?Tom
Hi Thanks. IEEE754 I am aware. This is from Ethereum blockchain information. As the minimum unit they can handle is from 10**(-18)ETH, (it's called WEI similar to SATOSHI in Bitcoin) the number tends to be super big. Blockchain itself does not require convert to decimal but analysis needs to use decimals and corresponding arithmetic manipulation.Kaz

2 Answers

2
votes

You will not be able to store such large integers precisely in a SAS numeric (64-bit double).

From https://v8doc.sas.com/sashtml/win/numvar.htm

Significant Digits and Largest Integer by Length for SAS Variables under Windows

| Length  | Largest Integer       | Exponential
| (Bytes) | Represented Exactly   | Notation 
+---------+-----------------------+------------
|    3    |                 8,192 | 2 ^ 13
|    4    |             2,097,152 | 2 ^ 21
|    5    |           536,870,912 | 2 ^ 29
|    6    |       137,438,953,472 | 2 ^ 37
|    7    |    35,184,372,088,832 | 2 ^ 45
|    8    | 9,007,199,254,740,992 | 2 ^ 53

So, while all 16-digit hex values in C can fit distinctly in a 64-bit integer, not all such can distinctly fit in SAS numeric (those > 2**53). The 64-bit double however is IEEE standard and can deal with numbers up to ~ 10**51 albeit with some loss of precision.

This code shows two different large integer assignments being stored as the same SAS numeric value different from what was assigned.

data _null_;
  x = 837886600861902976;
  y = 837886600861902977;
  put x= 20. / y= 20.;
  if x=y then put 'Huh? Ooohhh.';
run;
---------------------
x=837886600861902848
y=837886600861902848
Huh? Ooohhh.

One approach is to split the c-hex string into 'upper' and 'lower' parts and perform arithmetic with their inputs. The caveat is that there will be a loss of distinctness / precision when the actual hex value is > 2**53.

data _null_;
  length H $30;

  input H;

  put h=;

  if H =: '0x' then do;
    HX = substr(H,3); * skip over 0x;

    if (length (HX) > 15) then do;
      HX_lower = substr(HX, length(HX)-14);
      HX_upper = substr(HX, 1, length(HX)-15);
      if length (HX_lower) > 15 then put 'ERROR: ' H= 'too big for two parts';

      H_lower = input (HX_lower,hex15.);
      H_upper = input (HX_upper,hex15.);

      U = H_upper * 16**15;
      L = H_lower;

      put / H;
      put HX_upper $16. ' ->' H_upper 20. ' ->' U 25.;
      put HX_lower $16. ' ->' H_lower 20. ' ->' L 25.;

      H_num = U + L;
    end;
    else do;
      H_num = input (HX, hex15.);
    end;
  end;
  else
    put 'ERROR: ' H= 'is not c hex';

  put H= H_num= comma30.;
datalines;
0x1
0x1b
0x1ba
0x1ba0
0x1ba0c
0x1ba0c5
0x1ba0c56
0x1ba0c56c
0x1ba0c56c9
0x1ba0c56c96
0x1ba0c56c966
0x1ba0c56c966e
0x1ba0c56c966ee
0x1ba0c56c966ee8
0x1ba0c56c966ee8b
0x1ba0c56c966ee8b0
0x1ba0c56c966ee8b00
0x1ba0c56c966ee8b000
0x1ba0c56c966ee8b0000
magic!
run;
0
votes

By understanding those precision issues, the following is my own logic for the conversion.

data test;

length dum_s $24.; dum_s="1ba0c56c966ee8b0000";

do while(length(dum_s)<24);
    dum_s="0"||dum_s;
end;
    dum_dec1=input(substr(dum_s,1,8),hex8.);
    dum_dec2=input(substr(dum_s,9,8),hex8.);
    dum_dec3=input(substr(dum_s,17,8),hex8.);

    dum1=dum_dec1*16**16;
    dum2=dum_dec2*16**8;
    dum3=dum_dec3*16**0;
    dum=dum1+dum2+dum3;
    dum_ether=dum/10**18;

put dum_ether;
run;

Ethereum transaction hash 0x57662d92cb24943079dec7d83d3c39fc7a1ae958b22de58ba62c8c4cb425cab3 and it's value is in hexa basis "0x1ba0c56c966ee8b0000" and my logic above gives me 8154.35 ethereum (10**18 times to wei) which is same as the site. https://etherscan.io/tx/0x57662d92cb24943079dec7d83d3c39fc7a1ae958b22de58ba62c8c4cb425cab3 When the value becomes with floating points then there is some precision issues I do be aware.

Thanks!