12
votes

Consider the example:

>>> from sys import maxint
>>> type(maxint)
<type 'int'>
>>> print maxint
9223372036854775807
>>> type(maxint+2)
<type 'long'>
>>> print maxint+2
9223372036854775809
>>> type((maxint+2)+maxint)
<type 'long'>
>>> print ((maxint+2)+maxint)
18446744073709551616

Python will autopromote from an int, which in this case is a 64 bit integer value (OS X, python 2.6.1) to a python long integer which is of arbitrary precision. Even though the types are not the same, they are similar and Python allows the usual numeric operators to be used. Usually this is helpful, for example to be able to use code that expects 64-bit values on a 32-bit machine.

However, the arbitrary precision operations are vastly slower than the native int operations. For example:

>>> print maxint**maxint # execution so long it is essentially a crash

Is there a way to defeat or disallow the auto-promotion of a Python int to a Python long?

6
maxint**maxint is a number with >>750 decimals, i hope you're not really suprised it takes a while. Also what is supposed to happen when a number does not fit into 32 bit? - Jochen Ritzel
You're saying basic mathematically operations is making your application run hours longer it otherwise would? That sounds like your error, not python's - Falmarri
Also, what should happen instead of autopromotion? Segfault? Sounds like you should keep your numbers below sys.maxint... - Falmarri
Raising OverflowError is perfectly reasonable alternative behavior to converting to a long, though there's no mechanism in the language to do this (eg. there's probably no sane to do it for just your code and not libraries, which means it'd break things). - Glenn Maynard
@Glenn Maynard: That is really the answer I was looking for. Is there a 'pythonic' way that me as a newby could not see or Google that change the default behavior of an int. Numpy is probably closest, but lots of code to change to get there. - dawg

6 Answers

5
votes

So you want to throw out the One True Way and go retro on overflows. Silly you.

There is no good upside to the C / C++ / C# / Java style of overflow. It does not reliably raise an error condition. For C and C99 it is "undefined behavior" in ANSI and POSIX (C++ mandates modulo return) and it is a known security risk. Why do you want this?

The Python method of seamlessly overflowing to a long is the better way. I believe this is the same behavior being adapted by Perl 6.

You can use the Decimal module to get more finite overflows:

>>> from decimal import *
>>> from sys import maxint
>>> getcontext()
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, capitals=1,
flags=[], traps=[DivisionByZero, Overflow, InvalidOperation])

>>> d=Decimal(maxint)
>>> d
Decimal('9223372036854775807')
>>> e=Decimal(maxint)
>>> f=d**e
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/decimal.py", line 2225, in __pow__
    ans = ans._fix(context)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/decimal.py", line 1589, in _fix
    return context._raise_error(Overflow, 'above Emax', self._sign)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/decimal.py", line 3680, in _raise_error
    raise error(explanation)
decimal.Overflow: above Emax

You can set your precision and boundary conditions with Decimal classes and the overflow is nearly immediate. You can set what you trap for. You can set your max and min. Really -- How does it get better than this? (I don't know about relative speed to be honest, but I suspect it is faster than numby but slower than native ints obviously...)

For your specific issue of image processing, this sounds like a natural application to consider some form of saturation arithmetic. You also might consider, if you are having overflows on 32 arithmetic, check operands along the way on obvious cases: pow, **, *. You might consider overloaded operators and check for the conditions you don't want.

If Decimal, saturation, or overloaded operators don't work -- you can write an extension. Heaven help you if you want to throw out the Python way of overflow to go retro...

5
votes

If you want arithmetic overflows to overflow within e.g. 32 bits, you could use e.g. numpy.uint32.

That gives you a warning when an overflow occurs.

>>> import numpy
>>> numpy.uint32(2**32-3) + numpy.uint32(5)
Warning: overflow encountered in ulong_scalars
2

I tested its speed though:

>\python26\python.exe -m timeit "2**16 + 2**2"
1000000 loops, best of 3: 0.118 usec per loop

>\python26\python.exe -m timeit "2**67 + 2**65"
1000000 loops, best of 3: 0.234 usec per loop

>\python26\python.exe -m timeit -s "import numpy; numpy.seterr('ignore')" "numpy.uint32(2)**numpy.uint32(67) + numpy.uint32(2)**numpy.uint32(65)"
10000 loops, best of 3: 34.7 usec per loop

It's not looking good for speed.

2
votes

You can force your values to return to normal ints if you include an num = int(num) occasionally in your algorithm. If the value is long but fits in a native int, it will demote down to int. If the value doesn't fit in a native int, it will remain a long.

1
votes

Int vs long is an historical legacy - in python 3, every int is a "long". If your script speed is limited by int computation, it is likely that you are doing it wrong.

To give you a proper answer, we need more information on what are you trying to do.

1
votes

Well, if you don't care about accuracy you could all of your math ops modulo maxint.

1
votes

I don't know if it would be faster, neccesarily, but you could use numpy arrays of one element instead of ints.

If the specific calculation you are concerned about is integer exponentiation, then there are some inferences we can draw:

def smart_pow(mantissa, exponent, limit=int(math.ceil(math.log(sys.maxint)/math.log(2)))):
    if mantissa in (0, 1):
        return mantissa
    if exponent > limit:
        if mantissa == -1: 
            return -1 if exponent&1 else 1
        if mantissa > 1:
            return sys.maxint
        else: 
            return (-1-sys.maxint) if exponent&1 else sys.maxint
    else: # this *might* overflow, but at least it won't take long
        return mantissa ** exponent