3
votes

I am working to implement a code which was written in MATLAB into C++.

In MATLAB you can slice an Array with another array, like A(B), which results in a new array of the elements of A at the indexes specified by the values of the element in B.

I would like to do a similar thing in C++ using vectors. These vectors are of size 10000-40000 elements of type double.

I want to be able to slice these vectors using another vector of type int containing the indexes to be sliced.

For example, I have a vector v = <1.0, 3.0, 5.0, 2.0, 8.0> and a vector w = <0, 3, 2>. I want to slice v using w such that the outcome of the slice is a new vector (since the old vector must remain unchanged) x = <1.0, 2.0, 5.0>.

I came up with a function to do this:

template<typename T>
std::vector<T> slice(std::vector<T>& v, std::vector<int>& id) {

    std::vector<T> tmp;
    tmp.reserve(id.size());

    for (auto& i : id) {
        tmp.emplace_back(v[i]);
    }

    return tmp;
}

I was wondering if there was potentially a more efficient way to do such a task. Speed is the key here since this slice function will be in a for-loop which has approximately 300000 iterations. I heard the boost library might contain some valid solutions, but I have not had experience yet with it.

I used the chrono library to measure the time it takes to call this slice function, where the vector to be sliced was length 37520 and the vector containing the indexes was size 1550. For a single call of this function, the time elapsed = 0.0004284s. However, over ~300000 for-loop iterations, the total elapsed time was 134s.

Any advice would be much appreicated!

2
Is returning tmp vector a seg fault? I would have thought you would have to pass by ref or use dynamic storage (maybe wrapped in a smart ptr too).jackw11111
for (auto i : id) should run faster than for (auto& i : id), since it involves one less dereference per iteration of the innermost loop. Also, declaring your function parameters as const ref might help the compiler optimise your code.Paul Sanders
@jackw11111 No, returning a temporary or local variable by value is fine. Returning one by reference is not.Paul Sanders
@PaulSanders Is that because the values in the tmp vector outlive the function? ie. id is being passed by ref?jackw11111
@jackw11111 I'm not quite sure what you mean by a seg faultluca

2 Answers

3
votes

emplace_back has some overhead as it involves some internal accounting inside std::vector. Try this instead:

template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id) {

    std::vector<T> tmp;
    tmp.resize (id.size ());

    size_t n = 0;
    for (auto i : id) {
        tmp [n++] = v [i];
    }

    return tmp;
}

Also, I removed an unnecessary dereference in your inner loop.


Edit: I thought about this some more, and inspired by @jack's answer, I think that the inner loop (which is the one that counts) can be optimised further. The idea is to put everything used by the loop in local variables, which gives the compiler the best chance to optimise the code. So try this, see what timings you get. Make sure that you test a Release / optimised build:

template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id) {

    size_t id_size = id.size ();
    std::vector<T> tmp (id_size);
    T *tmp_data = tmp.data ();

    const int *id_data = id.data ();
    const T* v_data = v.data ();

    for (size_t i = 0; i < id_size; ++i) {
        tmp_data [i] = v_data [id_data [i]];
    }

    return tmp;
}
3
votes

The performance seems a bit slow; are you building with compiler optimizations (eg. g++ main.cpp -O3 or if using an IDE, switching to release mode). This alone sped up computation time around 10x.

If you are using optimizations already, by using basic for loop iteration (for int i = 0; i < id.size(); i++) computation time was sped up around 2-3x on my machine, the idea being, the compiler doesn't have to resolve what type auto refers to, and since basic for loops have been in C++ forever, the compiler is likely to have lots of tricks to speed it up.

template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id){

    // @Jan Schultke's suggestion
    std::vector<T> tmp(id.size ());

    size_t n = 0;
    for (int i = 0; i < id.size(); i++) {
        tmp [n++] = v [i];
    }

    return tmp;
}