3 votes

Excuse me for my ignorance.

If NumPy provides vectorized operations that make computation faster, how is it that for data type conversion pure Python is almost 8 times faster?

For example:

import numpy as np

a = np.random.randint(0, 500, 100).astype(str)
b = np.random.randint(0, 500, 100).astype(str)
c = np.random.randint(0, 500, 100).astype(str)

def A(a, b, c):
    for i, j, k in zip(a, b, c):
        d, e, f = int(i), int(j), int(k)
        r = d + e - f
    return

def B(a, b, c):
    for i, j, k in zip(a, b, c):
        d, e, f = np.array([i, j, k]).astype(int)
        r = d + e - f
    return

Then,

%%timeit 
A(a,b,c)

249 µs ± 3.13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
B(a,b,c)

1.87 ms ± 4.08 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Thank you, Ariel

"This is so so so wrong. I'll explain when I get to a desktop if someone else doesn't jump in." – Mad Physicist
"You're comparing apples to tennis balls here." – Mad Physicist

1 Answer

8 votes

Yes, NumPy does provide vectorized operations that make computations faster than vanilla Python code. However, you aren't using them.

NumPy is intended to perform operations across entire datasets, not many repeated operations across small chunks of a dataset. The latter pushes the iteration back to the Python level, which increases runtime.
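To make that contrast concrete, here is a minimal sketch (illustrative only, not your code) of the same arithmetic done as one whole-array operation versus one element at a time:

arr = np.arange(1_000_000)

# Vectorized: a single call; the loop runs in compiled code.
doubled = arr * 2

# Python-level: a million trips through the interpreter,
# boxing and unboxing a scalar each time.
doubled = [x * 2 for x in arr]

Timing both with %timeit shows the vectorized form winning by a wide margin on arrays of this size.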

Your primary issue is that the only "vectorized" operation you are using is astype, but you're applying it to three elements at a time, so you still loop just as much as the naive Python solution. Combine that with the additional overhead of creating a NumPy array at each iteration of your loop, and it's no wonder your NumPy attempt is slower.

On tiny datasets, Python can be faster, since NumPy has overhead from creating arrays, passing objects to and from lower-level libraries, etc. Let's take a look at the casting operation you are using on three elements at a time:

%timeit np.array(['1', '2', '3']).astype(int)
5.25 µs ± 89.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.array(['1', '2', '3'])
1.62 µs ± 42.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

About 30% of the runtime is spent just allocating the array! Compare this to your pure-Python version:

%timeit a, b, c = int('1'), int('2'), int('3')
659 ns ± 50.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

So if you operate only on chunks of this size, Python will beat NumPy.
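The flip side is that this fixed setup cost is paid once per call, regardless of array size, so it amortizes away on larger inputs. A rough sketch you can time yourself (sizes are arbitrary; exact numbers depend on your machine):

small = np.random.randint(0, 500, 3).astype(str)
large = np.random.randint(0, 500, 100_000).astype(str)

# Fixed overhead dominates on 3 elements...
%timeit small.astype(int)

# ...but is negligible on 100,000 elements, where the
# per-element conversion work dominates.
%timeit large.astype(int)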


But you have many more elements than three, so NumPy can speed up your code substantially; you just need to change how you approach the problem. Instead of focusing on how the operation applies to individual scalars, think about how it applies to whole arrays.

To vectorize this problem, the general idea is:

  • Create a single array containing all of your values.
  • Convert the entire array to int with a single astype call.
  • Take advantage of elementwise operations to apply your desired arithmetic to the array.

It ends up looking like this:

def vectorized(a, b, c):
    u = np.array([a, b, c]).astype(int)
    return u[0] + u[1] - u[2]
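Here np.array([a, b, c]) stacks the three 1-D string arrays into a single (3, N) array, so one astype call converts every element at once, and the addition and subtraction each run as a single elementwise operation over whole rows.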

Now that NumPy is being used correctly, we can make a fair comparison against an equivalent pure-Python loop (one that actually builds the result):

def python_loop(a, b, c):
    out = []
    for i, j, k in zip(a, b, c):
        d, e, f = int(i), int(j), int(k)
        out.append(d + e - f)
    return out

a, b, c = np.random.randint(0, 500, (3, 100_000)).astype(str)

%timeit vectorized(a, b, c)
181 ms ± 6.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit python_loop(a, b, c)
206 ms ± 7.97 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> np.array_equal(python_loop(a, b, c), vectorized(a, b, c))
True

Converting from strings to integers is not something NumPy does much faster than pure Python, as you can see from the timings: the two are fairly close. But by applying a vectorized approach, the comparison is at least much fairer.
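Note that the string-to-int parsing is the bottleneck in both versions here. Once the data is numeric, the elementwise arithmetic itself is where NumPy's advantage gets dramatic; a quick sketch (same shapes as above, but skipping the string round-trip):

ai, bi, ci = np.random.randint(0, 500, (3, 100_000))

# Pure elementwise arithmetic, no parsing: a single NumPy
# expression like this is typically orders of magnitude faster
# than the equivalent Python loop.
r = ai + bi - ci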