2
votes

I need to do some analysis on a large dataset using NumPy.

I have:

  1. One 50X1 Matrix (eigenvalues)
  2. One 50x50 Matrix (eigenvectors)

I need to be able to take each element of the eigenvalue matrix and multiply by column corresponding to the eigenvector.

So, multiply i-th element of array one, by i-th column of array 2, and so on for all i's.

Any ideas? :/

2
Maybe use numpy's multiply?Assaf Lavie

2 Answers

1
votes

First convert 1-dim eigenvalue vector into diagonal matrix. Then, apply matrix multiplication.

import numpy as np
eigenval_diag = np.diag(eigenvalue_vec) # 50x50 matrix
result = eigenval_diag * eigen_matrix # 50x50 matrix
1
votes

You can do this using the numpy broadcasting rules:

n = 4
A = np.random.randint(0, 10, size=(n,n))
B = np.array([1,0,2, 0])
B = B.reshape((1,n))
C = B * A

The multiplication is between a (1, n) and (n, n) matrix. To satisfy the broadcasting rule, the B matrix will be "extended" to a (n, n) array before the multiplication, which is then performed element-by-element as usual.

The above multiplication is equivalent to

BB = np.array([[1,0,2, 0],
               [1,0,2, 0],
               [1,0,2, 0],
               [1,0,2, 0]])
C = BB * A

but you never have to construct the matrix BB in memory.

Edit: Benchmarks

Since using a diagonal matrix might seem easier to read, I present the following quick benchmark you can try yourself.

# Setup data
n = 50
A = np.random.normal(size=(n,n))
B = np.random.normal(size=n)
B1 = B.reshape(1, 3)

# Make sure results are the same
C = np.dot(A, np.diag(B))
C1 = B1 * A
print np.allclose(C, C1) # Should be 'True'

# Bench with IPython
>>> %timeit np.dot(A, np.diag(B))
The slowest run took 7.44 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 36.7 µs per loop

>>> %timeit B1 * A
The slowest run took 10.27 times longer than the fastest. This could mean that an intermediate result is being cached 
100000 loops, best of 3: 6.64 µs per loop

I.e. for a 50x50 matrix, using broadcasting is in the order of 6 times as fast as using np.diag and matrix multiplication.