I have a mix of sparse vectors and dense vectors. I need to calculate complex products of them.
Sparse vectors have mostly zeros, and I need to ignore the zeros.
One common way to encode sparse vectors is to store an [index, value] pair for each nonzero value. This uses 2× memory per nonzero element.
But above 50% nonzero elements, the [index, value] encoding starts using more memory than the dense encoding [x0, x1, ..., xn] (a plain array).
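To make the crossover concrete, here is a quick back-of-the-envelope comparison (my assumption: one machine word per index and one per value, so the numbers are word counts, not bytes):

```python
# Memory cost in machine words, assuming one word per index and one per value.
def sparse_words(nnz):
    """[index, value] pairs: 2 words per nonzero element."""
    return 2 * nnz

def dense_words(n):
    """Plain array: one word per slot, zeros included."""
    return n

n = 1000
for density in (0.25, 0.50, 0.75):
    nnz = int(n * density)
    winner = "sparse" if sparse_words(nnz) < dense_words(n) else "dense"
    print(f"density {density:.0%}: sparse={sparse_words(nnz)} "
          f"dense={dense_words(n)} -> {winner}")
```

Under this cost model the break-even point is exactly 50% density (2 · nnz = n), which is the crossover described above.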
The question is whether there is some intermediate encoding scheme that automatically adapts to the percentage of zeros, always using less than or equal memory compared to the dense encoding.
I do not want to reinvent the wheel; this must be a common problem with many existing solutions, but I don't know where or what to search for.
For calculating the product, I need to read the vectors quickly and create the result of the product quickly, so I searched for compression with random access. But what I found were compression algorithms that are too slow compared to the [index, value] access time, and that take a lot of computing time to create new vectors.
If I use compression, ideally, I would quickly decompress (decode) the vector into [index, value], compute the product, and re-compress an [index, value] result whose indexes differ from those of the operands (for more context: indexY = index1 XOR index2, so the index ordering of the product differs from that of the operands).
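For clarity, this is the kind of product I mean, sketched with a dict-based [index, value] representation (my assumption: values are combined by plain multiplication and accumulated when result indexes collide):

```python
# Sketch of the XOR-index product described above. Operands are dicts
# mapping index -> nonzero value, i.e. a sparse [index, value] encoding.
def xor_product(a, b):
    out = {}
    for i1, v1 in a.items():
        for i2, v2 in b.items():
            k = i1 ^ i2                      # result index = XOR of operand indexes
            out[k] = out.get(k, 0) + v1 * v2  # accumulate on collisions
            if out[k] == 0:                  # drop exact cancellations to stay sparse
                del out[k]
    return out

a = {1: 2.0, 4: 3.0}
b = {1: 5.0}
print(xor_product(a, b))  # {0: 10.0, 5: 15.0}
```

Note that the result indexes (0 and 5 here) are in a different order than in either operand, which is why re-compressing the result is a separate step from decoding the inputs.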
I think that I only need to compress the positions, not the values of the arrays.
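One position-only scheme I have in mind (a sketch of my own, not a named library method) is an occupancy bitmap plus packed values: with 64-bit values, the cost is roughly n/64 + nnz words versus n words dense, so it only loses to the dense array when the vector is nearly full.

```python
# Sketch: compress only the positions with an n-bit occupancy bitmap,
# and store the nonzero values packed in index order.
def encode(dense):
    bitmap = 0
    values = []
    for i, x in enumerate(dense):
        if x != 0:
            bitmap |= 1 << i          # bit i set  <=>  slot i is nonzero
            values.append(x)
    return bitmap, values

def decode(bitmap, values, n):
    out = [0] * n
    it = iter(values)
    for i in range(n):
        if (bitmap >> i) & 1:
            out[i] = next(it)         # values were packed in index order
    return out

v = [0, 3.5, 0, 0, 7.0, 0]
bm, vals = encode(v)
assert decode(bm, vals, len(v)) == v
```

Random access into such a structure needs a population count over the bitmap prefix to find a value's packed position, which is cheap on modern hardware; whether that is fast enough compared to plain [index, value] access is exactly what I am unsure about.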
But I hope that there is already a common method for doing something intermediate between [index, value] and a plain 1D array. I just don't know what keywords to search for.