1
votes

I'm using numpy to do linear algebra. I want to do fast subset-indexed dot and other linear operations.

When dealing with big matrices, slicing solution like A[:,subset].dot(x[subset]) may be longer than doing the multiplication on the full matrix.

A = np.random.randn(1000,10000)
x = np.random.randn(10000,1)
subset = np.sort(np.random.randint(0,10000,500))

Timings show that sub-indexing can be faster when columns are in one block.

%timeit A.dot(x)
100 loops, best of 3: 4.19 ms per loop

%timeit A[:,subset].dot(x[subset])
100 loops, best of 3: 7.36 ms per loop

%timeit A[:,:500].dot(x[:500])
1000 loops, best of 3: 1.75 ms per loop

Still the acceleration is not what I would expect (20x faster!).

Does anyone know an idea of a library/module that allow these kind of fast operation through numpy or scipy?

For now on I'm using cython to code a fast column-indexed dot product through the cblas library. But for more complex operation (pseudo-inverse, or subindexed least square solving) I'm not shure to reach good acceleration.

Thanks!

1
Why do you expect 20x speedup? - hpaulj
Naively, computing the result for 10000 columns should be 20x faster than computing 500. - Tonio Bonnef

1 Answers

0
votes

Well, this is faster.

%timeit A.dot(x)
#4.67 ms

%%timeit
y = numpy.zeros_like(x)
y[subset]=x[subset]
d = A.dot(y)
#4.77ms

%timeit c = A[:,subset].dot(x[subset])
#7.21ms

And you have all(d-ravel(c)==0) == True.

Notice that how fast this is depends on the input. With subset = array([1,2,3]) you have that the time of my solution is pretty much the same, while the timing of the last solution is 46micro seconds.

Basically this will be faster if the size ofsubset is not much smaller than the size of x