0
votes

I want to reduce the number of characters used to represent a very big number.

For example. Take this number 4593638492...3837292 (1000+ characters long)

What techniques or tricks can I use to represent this number using fewer characters (say around 50 characters or even less)

Thanks in advance :)

2
Perhaps by using a higher numerical base than base 10, such as hexadecimal (base 16). This reduces the number of digits since each digit's place value in base 16 is greater than it would be in base 10 (so you need a smaller number of digits to represent the same number)).Telescope
Representing a 1000-decimal-digit number in 50 characters would require an alphabet of 100,000,000,000,000,000,000 different characters, which is stretching the bounds of reality, since no-one could learn how to read such an alphabet. What is it that you are really trying to accomplish?rici
@rici: 10^20 nearly fits in a 64 bits integer (1000 digits would take 52 such integers). This is a quite acceptable numeration base. I would not be surprised that it is effectively in use in some BigNum implementation.Yves Daoust
@yves: indeed, GMP uses 64-bit limbs on many platforms. But that doesn't save (much) memory. (Python's native bignums use 15- or 30-bit limbs, which makes them about 6% bigger.) But I don't believe that's what OP is talking about. They use the word "character", which in normal discourse would not be applied to an alphabet of 2^64 possibilities, and there is no evidence that they meant anything else. "Representing the number using fewer characters" is certainly not the same as "storing the number with a different limb size".rici
How are these big numbers obtained, how important is is to exactly represent the number precisely vs approximately, and what is the use case? If this is for human readability, you can give the number a name and define it once. "Dickson's number" or whatever.Dave

2 Answers

2
votes

You can't. (Unless your numbers were created from a simple rule, such as, say, the factorial of an integer.)

Characters are represented on 8 bits (one byte) and a number of 1000 digits holds 3322 bits of information. At best you can pack as 416 characters. (Or 208 characters of the wide type, but that brings no compression.)


Think of this: with three-digits numbers, you can represent 1000 distinct numbers (from 000 to 999). This does not allow you to encode arbitrary four-digits numbers, as there are 10000 of these. Unless, for some reason, there are less than 1000 four-digits numbers that need to be encoded, and that subset has a computable definition (e.g. only the odd multiples of 7 or only primes).

1
votes

First, determine if there's a simple mathematical expression to give the number. Maybe a factorization. For example, it takes less space to write 2^64*3^40 than 224269343257001716702690972139746492416.

Otherwise, as Telescope suggested in their comment, use a higher numerical base. In general, writing the integer n in base b takes approximately (log n)/(log b) characters.

Base 36 is a convenient choice, with case-insensitive alphanumeric numerals, which use 64.25% of the space of the base-10 equivalent.

If you want to stick to ASCII digits, try Base 94, using the printable ASCII characters from 33 to 126. Base 94 numerals are 50.68% of the length of the base-ten equivalent.

If you can use Unicode, maybe pick a block of 1000 Kanji characters to use as your "digits". That takes only 1/3 of the space of base-ten. Or if you can find 10,000 usable "digits", you're down to only 1/4 of the space.