Can you use existing linear algebra packages? If you are dealing with primative types, like double
BLAS is probably the most optimal way to go, but can have a steep learning curve. For a highly optimized but very user friendly library Eigen is one of my favourite option for such tasks in c++.
I would highly recommend using an existing linear algebra package (not even necessarily the ones I mentioned). It would make it easier to flesh out your ideas since the actual implementation is taken care of from the package. Not to mention that such packages have existed for years (several decades in case of BLAS) and should be very very VERY good at such tasks. Unless you really know what you are doing (have a very very specific task in mind with specific optimisations that you can code in) I doubt you can optimise as well as these libraries easily your self (if at all). Even then, there is a cost-benefit analysis to consider: is me doing this my self going to take how much time vs an existing good package doing it?
Although I highly recommend against doing this your self, if you absolutely must do it your self, one question that is unclear is are all the blocks the same size? Also in what form are the matrices stored, column or row major? Assuming blocks are the same size, and you have row major form, a sketch of what you could do is to iterate over blocks and relegate the block-block multiplication to a generic matrix multiplication function. I am dropping the double*&
and passing only pointers double*
. The operator[]
should take care of referencing the correct location, but cheack that I did the arithmetic inside []
correctly your self as well:
EDIT: If A
and B
are only storing upper triangular blocks I corrected the code
//Assuming all blocks are the same size
//Assuming matrix stored in row major form
#define NUMBER_OF_BLOCKS = MATRIX_SIZE/BLOCK_SIZE
void block_mult2(double* A, double* B, double* C){
for(size_t i=0; i<NUMBER_OF_BLOCKS; i++)
for(size_t j=0; j<NUMBER_OF_BLOCKS; j++)
for(size_t k=0; k<NUMBER_OF_BLOCKS; k++)
mult2(A[min(i,j)*BLOCK_SIZE*NUMBER_OF_BLOCKS + max(i,j)*BLOCK_SIZE],
B[min(j,k)*BLOCK_SIZE*NUMBER_OF_BLOCKS + max(j,k)*BLOCK_SIZE],
C[i*BLOCK_SIZE*NUMBER_OF_BLOCKS + k*BLOCK_SIZE]);
return;
}
void mult2(double* A, double* B, double* C){
for(size_t i=0; i<BLOCK_SIZE; i++)
for(size_t j=0; j<BLOCK_SIZE; j++)
for(size_t k=0; k<BLOCK_SIZE; k++)
C[i*BLOCK_SIZE+k] = A[min(i,j)*BLOCK_SIZE+max(i,j)]*B[min(j,k)*BLOCK_SIZE+max(j,k)];
return;
}
I cannot stress enough how much I recommend you drop all this and take a bit of time to learn a linear algebra package. You will rid yourself of alot of technical questions (ex that just came up: did I do pointer arithmetic correctly?) and you could use the package for so many more tasks. It will benefit your overall work I think.