Efficiently inserting column/row into a matrix stored in row/col-major vector in-place

Question

It's not hard to efficiently insert row or column into a matrix stored in a row- or col-major (respectively) vector. The problem of inserting row into a col-major vector or column into a row-major vector is slightly more interesting.

For example, given a 2x3 matrix stored in row-major in vector:

1 2 3    <=>    1 2 3 4 5 6 
4 5 6

and a column 7 8 that is inserted before after column 1 in the original matrix, we get:

1 7 2 3    <=>    1 7 2 3 4 8 5 6 
4 8 5 6

[Inserting a row into a col-major vector is similar.]

The sample setup in C++:

auto m = 2; // #rows
auto n = 3; // #cols
// row-major vector
auto x = std::vector<double>{1,2,3,4,5,6};

auto const colIndex = 1;
auto const col = std::vector<double>{7,8};
// insert column {7,8} into the 2nd position
// =>{1,7,2,3,4,8,5,6}

There could be various options to achieve this algorithmically and in C++, but we're looking for the efficiency and scalability to large matrices and multiple inserts.

The first obvious option that I can think of is to use std::vector<double>::insert to insert new elements to the correct positions:

//option 1: insert in-place
x.reserve(m*(n+1));
for(auto i = 0; i < col.size(); i++)
   x.insert(begin(x) + colIndex + i * (n + 1), col[i]);

, which is valid but extremely slow even for moderate data sizes because of the resizing and shifting on each iteration.

Another, more direct option is to create another vector, populate all the columns in the ranges [0,colIndex),colIndex,(colIndex,n+1], and swap it with the original vector:

// option 2: temp vec and swap
{
    auto tmp = std::vector<double>(m*(n+1));

    for(auto i = 0; i < m; i++)
    {
        for(auto j = 0; j < colIndex; j++)
            tmp[j + i * (n + 1)] = x[j + i * n];

        tmp[colIndex + i * (n + 1)] = col[i];

        for(auto j = colIndex + 1; j < n + 1; j++)
            tmp[j + i * (n + 1)] = x[(j - 1) + i * n];
    }

    std::swap(tmp, x);
};

This is much faster than the option 1, but requires extra space for the matrix copy and iterating over all elements.

Are there any other ways to achieve this that would beat the above in speed/space or both?

Example code on ideone: https://ideone.com/iXrPfF

Out of curiosity have you tried your first option, but first calling std::vector::reserve() so that it only has to reallocate once? There will still be duplicate shifting (which can be optimized out). I'm crafting an answer to address that part of it. — TypeIA
@TypeIA good point - thanks. I did try it - the improvement is negligible even for matrices of size like 100k*10k. But it's still better, you're right - I'll edit. — Oleg Shirokikh
why are you starting your shifting at the front of the vector? Start it at the back then you only have to shift things once, and it becomes a single for loop rather than one with nested for loops — UKMonkey
Generally it makes more sense to figure out which container to use based on what needs to be done with it, instead of picking a container first, and then figure out how to what needs to be done in the most efficient manner. If it's necessary to have both efficient row and column insertions into a matrix, then a one-dimension vector is not the right container. It's going to be hard to figure out how to implement this efficiently. One will need a fairly complicated, non-linear, container to do this. — Sam Varshavchik
@SamVarshavchik thanks - true story. however this insert functionality is just one of many and not the most important operation. there're other linear algebra routines that require contiguous memory, so array/vector is pretty much the only choice here — Oleg Shirokikh

TypeIA TypeIA · Accepted Answer · 2018-01-18T22:35:34

This version is likely to be much faster, especially at scale, and could be the basis for further micro-optimization (if [and only if] really necessary):

// one-time reallocation of the vector to get space for the new column
x.resize(x.size() + col.size());

// we'll start shifting elements over from the right
double *from = &x[m * n];
const double *src = &col[m];
double *to = from + m;

size_t R = n - colIndex; // number of cols left of the insert
size_t L = colIndex;     // number of cols right of the insert

while (to != &x[0]) {
    for (size_t i = 0; i < R; ++i) *(--to) = *(--from);
    *(--to) = *(--src); // insert value from new column
    for (size_t i = 0; i < L; ++i) *(--to) = *(--from);
}

ideone

This doesn't require any temporary allocation and aside from possible micro-optimizations of the loop it's probably about as fast as it gets. To understand how it works, we can start by observing that the bottom-right element of the original matrix is being shifted m elements to the right in the source vector. Working backwards from the last element, at some point a value from the inserted column vector gets inserted, and subsequent elements from the source vector are now shifted m - 1 only elements to the right. Using that logic we simply construct a 3-phase loop that works from right to left on the source array. The loop iterates m times, once for each row. The three phases of the loop, corresponding to its three lines of code, are:

Shift row elements that are "to the right" of the insertion point.
Insert the row value from the new column.
Shift row elements that are "to the left" of the insertion point (shifting one less place than in phase 1).

There's also serious room for improvement in the naming of the variables, and the algorithm should certainly be encapsulated in its own function with proper input parameters. One possible signature would be:

void insert_column(std::vector<double>& matrix,
    size_t rows, size_t columns, size_t insertBefore,
    const std::vector<double>& column);

From here there's further room for improvement in making it generic using templates.

And from there, you might observe that the algorithm has possible application beyond matrices. What's really happening is that you're "zippering" two vectors together with a skip and an offset (i.e., starting at element i, insert an element from B into A after every n'th element).

Efficiently inserting column/row into a matrix stored in row/col-major vector in-place

3 Answers