1
votes

I want to build a block diagonal matrix from a list of square matrices of the same shape.

This answer explains how to do it for a single matrix, but I have a whole bunch of different matrices that need to be made into a block matrix.

I want to perform this operation in Theano on GPU, so performance is a necessity (and the function has to exist in Theano).

Details : The reason for doing this is to accelerate the calculation of eigenvalues/vectors on GPU when there are many small matrices (e.g. around 10000 matrices 7x7). Instead of taking the eigenvalues of each small matrix separately (very slow on GPU), I want to perform a big EVD over the block diagonal matrix (same eigenvalues than the small matrices). Hopefully this will be faster and the sparsity of the matrix will not create much overhead (or maybe EIGH will take advantage of that).

2
There is a block diagonal method in scipy, maybe you could use that. Why do you need to construct the matrix in theano? docs.scipy.org/doc/scipy-1.0.0/reference/generated/… - Jacques Kvam
@JacquesKvam I noticed that scipy has this function, though I am looking for an implementation in numpy / theano. Why do you ask for the reason ? I am using theano, so I need it in theano ! - Toool
If you only need to construct the matrix one time, you could do it with numpy/scipy and convert it to theano. Hard to say without more details though. - Jacques Kvam
@JacquesKvam Added some details. Of course if I needed to do it only once I'd not do it in theano : problem is I have to perform this many times during training of a DNN (actually, as a part of the gradient descent algorithm). - Toool
This Mathematica answer does it although I'm having trouble to translate the code in numpy. - Toool

2 Answers

2
votes

Thank you Tool

I have modified the code a bit so that it returns x on the k-th diagonal.

def block_diagonal(x, k):
''' x should be a tensor-3 (#num matrices, n,n)
    k : int
    Diagonal in question. it is 0 in case of main diagonal. 
    Use k>0 for diagonals above the main diagonal, and k<0 for diagonals below the main diagonal.
'''

shape = x.shape
n = shape[-1]

absk = abs(k)

indx = np.repeat(np.arange(n),n)
indy = np.tile(np.arange(n),n)

indx = np.concatenate([indx + a * n for a in range(shape[0])])
indy = np.concatenate([indy + a * n for a in range(shape[0])])

if k<0: 
    indx += n*absk
else:
    indy += n*absk

block = np.zeros(((shape[0]+absk)*n,(shape[0]+absk)*n))
block[(indx,indy)] = x.flatten()

return block
0
votes

For future readers : I've found a way to do it purely in NumPy, which is slightly faster than the SciPy function :

def block_diagonal(x):
" x should be a tensor-3 (#num matrices, n,n) "

shape = x.shape
n = shape[-1]

indx = np.repeat(np.arange(n),n)
indy = np.tile(np.arange(n),n)

indx = np.concatenate([indx + k * n for k in range(shape[0])])
indy = np.concatenate([indy + k * n for k in range(shape[0])])

block = np.zeros((shape[0]*n,shape[0]*n))
block[(indx,indy)] = x.flatten()

return block

This implementation simply builds up the indices where the blocks will be situated, then fills it !

Timings :

matrix_list = [np.arange(i,i+49).reshape((7,7)) for i in range(1000)]

matlist = np.array(matrix_list)

%timeit block_diagonal(matlist)
25.6 ms ± 145 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit scipy.linalg.block_diag(*matrix_list)
28.6 ms ± 227 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

matrix_list = [np.arange(i,i+49).reshape((7,7)) for i in range(5000)]
matlist = np.array(matrix_list)

%timeit block_diagonal(matlist)
141 ms ± 75.2 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit scipy.linalg.block_diag(*matrix_list)
157 ms ± 201 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Note : The same function in Theano is way slower than its NumPy counterpart, probably because of the need to use scan for the concatenation/shift of the indices. Any idea about how to tackle this problem is welcome !