2
votes

I asked this question before and it was heavily downvoted. Since no one really looks at a triple-downvoted question again, I am reposting it to make clear that I am interested in the actual answer (if there is one).

Problem statement:

I am in a situation where I need the arbitrary-precision feature of pure Python integers. At some point in my code I have a numpy array of booleans. Something like:

arr

array([ True, False, False, False, True, True, True, False, True, True, False, False, True, True, True, False, True, False, False, True, False, True, True, True, True, True, False, True, False, True, True, False, True, True, False, True, False, False, True, False, True, True, False, True, False, True, True, False, True, True, True, False, False, False, True, False, False, True, True, True, True, False, True, False])

which I convert to numpy.int64 using arr.astype(int) so that I can do arithmetic on it.

But when I used the code below to convert it to a single integer, it overflowed (and produced negative numbers, which I don't want).

The code uses this function (which is pure Python and won't have any integer overflow issue by itself):

def bool2int(x):
    y = 0
    for i,j in enumerate(x):
        y += j<<i
    return y
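
As a quick sanity check of that claim, here is a minimal sketch that feeds the function plain Python bools. The sample inputs are made up for illustration and are not the array from this question:

```python
def bool2int(x):
    # Element i contributes 2**i when it is truthy (least significant bit first).
    y = 0
    for i, j in enumerate(x):
        y += j << i
    return y

small = bool2int([True, False, True])   # 1 + 4
print(small)                            # 5

# 70 set bits: larger than any signed 64-bit integer can hold,
# but Python ints grow as needed.
big = bool2int([True] * 70)
print(big == 2**70 - 1)                 # True
print(big > 2**63 - 1)                  # True
```

With native Python values the shifts and additions are all done on arbitrary-precision ints, so nothing wraps around.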

If I run the code directly on the np.array (whether or not it is converted to int makes no difference):

bool2int(arr)

-2393826705255337647

bool2int(arr.astype(int))

-2393826705255337647

while I need a positive integer. So I used a list comprehension:

bool2int([int(x) for x in arr])

16052917368454213969

Obviously, the number represented by arr exceeds the capacity of fixed-precision integers (i.e. 2^63 - 1), so I cannot use it directly.
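
The two results above are consistent with a signed 64-bit wrap-around: the negative number is what the positive one looks like when its low 64 bits are read as a two's-complement value. A small pure-Python sketch of that relationship follows; the helper name wrap_to_int64 is made up for illustration and is not a numpy function:

```python
INT64_BITS = 64

def wrap_to_int64(n):
    """Reinterpret the low 64 bits of n as a signed two's-complement integer."""
    n &= (1 << INT64_BITS) - 1           # keep only the low 64 bits
    if n >= 1 << (INT64_BITS - 1):       # top bit set means negative
        n -= 1 << INT64_BITS
    return n

exact = 16052917368454213969             # result of the list-comprehension version
print(exact > 2**63 - 1)                 # True
print(wrap_to_int64(exact))              # -2393826705255337647
```

In other words, the two values differ by exactly 2^64.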

Is there a more direct way to achieve this than a list comprehension?

Edit:

For the theory of integer overflow in Python I used this source.

I understand why you're re-posting, but editing the original question or adding a bounty (when the time comes) would be better practice, I think - Chris_Rands
@roganjosh: the overflow is not due to the conversion, but the bool2int, since by converting it to int, you get a numpy.int64, not a vanilla int. - Willem Van Onsem
@WillemVanOnsem, yes, that's what I am saying. I want the array as a list of vanilla ints - Eypros
@WillemVanOnsem ok, so it's a library method. I missed the point of the question then, sorry. The Boolean array distracted me. - roganjosh
You lose me at the point in your question where you say you have converted the array with the astype method but then still call bool2int. Why? - timgeb

2 Answers

2
votes

One way of getting native Python type elements is .tolist(). Note that we can do this directly on the boolean array. Your code works fine with native Python bools.

>>> import numpy as np
>>> x = np.random.randint(0, 2, (100,)).astype(bool)
>>> x
array([ True,  True, False,  True, False,  True, False, False,  True,
       False, False,  True,  True, False, False, False,  True, False,
       False,  True, False,  True, False, False,  True,  True,  True,
        True,  True,  True,  True, False, False, False, False, False,
        True,  True,  True,  True, False, False,  True, False, False,
       False, False,  True, False,  True,  True, False, False,  True,
       False,  True,  True,  True, False,  True,  True,  True, False,
        True,  True,  True,  True, False,  True,  True,  True, False,
        True, False,  True, False,  True, False,  True,  True,  True,
       False, False,  True,  True,  True,  True,  True, False, False,
        True, False, False, False,  True,  True,  True, False, False,  True], dtype=bool)
>>> bool2int(x)
-4925102932063228254
>>> bool2int(x.tolist())
774014555155191751582008547627L

As an added bonus it's actually faster.

>>> from timeit import timeit
>>> timeit(lambda:bool2int(x), number=1000)
0.24346303939819336
>>> timeit(lambda:bool2int(x.tolist()), number=1000)
0.010725975036621094
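
To illustrate the .tolist() point on a smaller case, here is a minimal sketch using a made-up 70-element array (the array and the names bits, via_tolist and via_packbits are illustrative, not from the question). It also shows one possible alternative, np.packbits combined with int.from_bytes, which is not part of the approach above and assumes NumPy 1.17 or later for the bitorder argument:

```python
import numpy as np

def bool2int(x):
    y = 0
    for i, j in enumerate(x):
        y += j << i
    return y

# 70 bits, i.e. more than a signed 64-bit integer can hold.
bits = np.zeros(70, dtype=bool)
bits[0] = True
bits[69] = True

as_list = bits.tolist()
print(type(as_list[0]))                  # <class 'bool'>

# Native bools, so the arithmetic is done on Python ints.
via_tolist = bool2int(as_list)
print(via_tolist == 1 + 2**69)           # True

# Alternative sketch (assumes NumPy >= 1.17): pack the bits into bytes,
# least significant bit first, then let Python build the integer.
packed = np.packbits(bits, bitorder="little")
via_packbits = int.from_bytes(packed.tobytes(), "little")
print(via_packbits == via_tolist)        # True
```

Both routes end in a plain Python int, so the length of the array does not limit the result.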
2
votes

Using astype(int) seems to work fine. The following code:

import numpy as np

test = np.array([True, False, False, False, True, True, True, False, True, True, False, False, True, True, True, False, True, False, False, True, False, True, True, True, True, True, False, True, False, True, True, False, True, True, False, True, False, False, True, False, True, True, False, True, False, True, True, False, True, True, True, False, False, False, True, False, False, True, True, True, True, False, True, False])
test_int = test.astype(int)

print(test_int)
print(test_int.sum())

Returns:

[1 0 0 0 1 1 1 0 1 1 0 0 1 1 1 0 1 0 0 1 0 1 1 1 1 1 0 1 0 1 1 0
1 1 0 1 0 0 1 0 1 1 0 1 0 1 1 0 1 1 1 0 0 0 1 0 0 1 1 1 1 0 1 0]

37

The overflow you describe seems unlikely here, so I would look into that again; maybe the error is somewhere else in your code.

Edit

If you want to get a Python type instead of a numpy object just do:

test.astype(int).tolist()
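
A short sketch of what that changes, using a small made-up array (the names as_numpy and as_python are only for illustration). The exact NumPy integer type depends on the platform, so the check below only tests for a NumPy integer in general:

```python
import numpy as np

test = np.array([True, False, True, True])

as_numpy = test.astype(int)              # elements are fixed-width NumPy integers
as_python = test.astype(int).tolist()    # elements are plain Python ints

print(isinstance(as_numpy[0], np.integer))   # True
print(type(as_python[0]))                    # <class 'int'>

# Plain Python ints can be shifted past 64 bits without wrapping.
shifted = as_python[0] << 100
print(shifted == 2**100)                     # True
```

Once the elements are Python ints, a function such as the question's bool2int does its arithmetic with arbitrary precision.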