Converting numpy array to pure python integer to avoid integer overflow

Question

I asked this question before and it was heavily downvoted. Since hardly anyone looks at a triple-downvoted question again, I am reposting it to make clear that I am interested in the actual answer (if there is one).

Problem statement:

I need the arbitrary-precision feature of pure Python integers. At some point in my code I have a NumPy array of booleans, something like:

arr

array([ True, False, False, False, True, True, True, False, True, True, False, False, True, True, True, False, True, False, False, True, False, True, True, True, True, True, False, True, False, True, True, False, True, True, False, True, False, False, True, False, True, True, False, True, False, True, True, False, True, True, True, False, False, False, True, False, False, True, True, True, True, False, True, False])

which I convert to numpy.int64 using arr.astype(int) so that I can do arithmetic on it.

But when I used the code below to convert it to a single integer, it overflowed (and produced negative numbers, which I don't want).

The code uses this function (which is pure Python and won't have any integer overflow issue by itself):

def bool2int(x):
    y = 0
    for i,j in enumerate(x):
        y += j<<i
    return y

If I run the code directly on the np.array (whether it is converted to int or not does not matter):

bool2int(arr)

-2393826705255337647

bool2int(arr.astype(int))

-2393826705255337647

while I need a positive integer. So I used a list comprehension:

bool2int([int(x) for x in arr])

16052917368454213969

Obviously, the number represented by arr exceeds the capacity of fixed-precision integers (i.e. 2⁶³-1), so I cannot use it directly.

Is there a more direct way to achieve this than a list comprehension?
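The negative result is consistent with 64-bit two's-complement wraparound: the two values reported above differ by exactly 2**64. A small pure Python check, using only the numbers from the outputs above (the helper name wrap_int64 is made up for this illustration and is not part of NumPy):

```python
# Values copied from the outputs above.
wrapped = -2393826705255337647      # result when run on the NumPy array
exact = 16052917368454213969        # result with plain Python ints

def wrap_int64(n):
    """Reinterpret an arbitrary Python int as a signed 64-bit value.

    Hypothetical helper for illustration; it mimics two's-complement
    wraparound and is not a NumPy function.
    """
    n &= (1 << 64) - 1               # keep only the low 64 bits
    return n - (1 << 64) if n >= (1 << 63) else n

print(exact - wrapped == 2**64)      # True
print(wrap_int64(exact) == wrapped)  # True
```

So the information is not lost in the low 64 bits; the fixed-width type simply reinterprets the top bit as a sign.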

Edit:

For the theory of integer overflow in Python I used this source.

I understand why you're re-posting, but editing the original question or adding a bounty (when the time comes) would be better practise I think — Chris_Rands
@roganjosh: the overflow is not due to the conversion, but the bool2int, since by converting it to int, you get a numpy.int64, not a vanilla int. — Willem Van Onsem
@WillemVanOnsem, yes that's what I am telling. I want the array as list of vanilla int — Eypros
@WillemVanOnsem ok, so its a library method. I missed the point of the question then, sorry. The Boolean array distracted me. — roganjosh
You lose me at the point in your question where you say you have converted the array with the astype method but then still call bool2int. Why? — timgeb

Paul Panzer · Accepted Answer · 2018-10-16T14:27:11

One way of getting native Python type elements is .tolist(). Note that we can do this directly on the boolean array. Your code works fine with native Python bools.

>>> import numpy as np
>>> x = np.random.randint(0, 2, (100,)).astype(bool)
>>> x
array([ True,  True, False,  True, False,  True, False, False,  True,
       False, False,  True,  True, False, False, False,  True, False,
       False,  True, False,  True, False, False,  True,  True,  True,
        True,  True,  True,  True, False, False, False, False, False,
        True,  True,  True,  True, False, False,  True, False, False,
       False, False,  True, False,  True,  True, False, False,  True,
       False,  True,  True,  True, False,  True,  True,  True, False,
        True,  True,  True,  True, False,  True,  True,  True, False,
        True, False,  True, False,  True, False,  True,  True,  True,
       False, False,  True,  True,  True,  True,  True, False, False,
        True, False, False, False,  True,  True,  True, False, False,  True], dtype=bool)
>>> bool2int(x)
-4925102932063228254
>>> bool2int(x.tolist())
774014555155191751582008547627L

As an added bonus it's actually faster.

>>> from timeit import timeit
>>> timeit(lambda:bool2int(x), number=1000)
0.24346303939819336
>>> timeit(lambda:bool2int(x.tolist()), number=1000)
0.010725975036621094
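Another possible route, offered as a sketch rather than as part of the answer above, is to pack the bits with np.packbits and let int.from_bytes build the Python integer. This assumes Python 3 (for int.from_bytes) and NumPy 1.17 or later (for the bitorder argument of np.packbits); the function name bool2int_packed is invented for this example.

```python
import numpy as np

def bool2int_packed(arr):
    """Interpret a 1-D boolean array as an integer, element i being bit i."""
    # bitorder="little" puts element 0 in the least significant bit of byte 0;
    # packbits pads the final byte with zero bits.
    packed = np.packbits(np.asarray(arr, dtype=bool), bitorder="little")
    # int.from_bytes returns an arbitrary-precision Python int, so no overflow.
    return int.from_bytes(packed.tobytes(), "little")

def bool2int(x):
    # Reference implementation from the question.
    y = 0
    for i, j in enumerate(x):
        y += j << i
    return y

bits = np.random.randint(0, 2, 100).astype(bool)
# Compare against the question's function fed with native Python bools.
print(bool2int_packed(bits) == bool2int(bits.tolist()))
```

This avoids the Python-level loop entirely, though whether it is faster than .tolist() for a given array size would need to be measured.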
