In the context of implementing a "search window" for fast dynamic time warping in C, I need to build a structure that represents a sparse binary matrix with a very particular structure.
This matrix has some strong structural constraints on it: specifically, there is only one "run" of non-zero data per row, although that run is of variable length. The value of the starting cell's column index and ending cell's column index increases monotonically as the row increases:
Example valid matrices (* = filled, . = unfilled)
0 1 2 3 4 5
0 * * * . . .
1 . * * * . .
2 . * * * . .
3 . . * * * *
4 . . * * * *
0 1 2 3 4 5
0 * * * * . .
1 . * * * * .
2 . . * * * .
3 . . . . * *
4 . . . . . *
Example invalid matrices:
0 1 2 3 4 5
0 * * * . * * <-- more than one continuous run
1 . * * * . .
2 . * * * . .
3 . . * * * *
4 . . * * * *
0 1 2 3 4 5
0 * * * . . .
1 . * * * . .
2 . * * * . .
3 * * * * * * <-- start index not monotonically increasing
4 . . * * * *
0 1 2 3 4 5
0 * * * . . .
1 . * * * * .
2 . * * * . . <-- end index not monotonically increasing
3 . . * * * *
4 . . * * * *
These sparse matrices are mostly populated on the diagonals, but they are not square, so I don't know if I should be using "jagged diagonal storage".
My current approach is to store (start, end) column indices for each row: i.e.
0 1 2 3 4 5
0 * * * . . .
1 . * * * . .
2 . * * * . .
3 . . * * * *
4 . . * * * *
is (logically) stored as
(0, 0) --> (0, 2)
(1, 1) --> (1, 3)
(2, 1) --> (2, 3)
(3, 2) --> (3, 5)
(4, 2) --> (4, 5)
(although these are internally stored as ravelled indices, i.e.)
(0 * num_cols + 0) --> (0 * num_cols + 2)
(1 * num_cols + 1) --> (1 * num_cols + 3)
so the final structure ends up looking like
[0, 2, 7, 9, 13, 15, 20, 23, 26, 29]
Then, this structure is delta encoded like
[first_value, offset_1, offset_2, ...]
[0, 2, 5, 2, 4, 2, 5, 3, 3, 3]
Because these delta values are small and positive, we can take advantage of packing them into some flavor of VARINT structure.
First question: do these matrices have a well-known name? Can't seem to find much in the literature.
Second question: is this storage scheme taking advantage of all the strong properties of the matrices? Can I abuse the constraints in order to store less data somehow?
I have read a few documents describing sparse matrix storage for general sparse matrices, but it strikes me that this special case structure might have a special case optimal storage / encoding scheme.