I'm having a hard time implementing a function with the Rcpp package using cppFunction. I need something like R's intersect that takes two NumericVector arguments and returns another NumericVector with the result, just like in R.

This document has been of some help but unfortunately I'm pretty much a noob in C++ atm.

How could I implement R's intersect function with cppFunction?

Thanks

1 Answer

You would probably want to use something like std::unordered_set to implement intersect:

File myintersect.cpp:

#include <Rcpp.h>
#include <unordered_set>  // std::unordered_set
#include <vector>         // std::vector
using namespace Rcpp;

// Enable C++11 via this plugin (Rcpp 0.10.3 or later)
// [[Rcpp::plugins(cpp11)]]

// [[Rcpp::export]]
NumericVector myintersect(NumericVector x, NumericVector y) {
    std::vector<double> res;
    std::unordered_set<double> s(y.begin(), y.end());
    for (int i=0; i < x.size(); ++i) {
        auto f = s.find(x[i]);
        if (f != s.end()) {
            res.push_back(x[i]);
            s.erase(f);
        }
    }
    return Rcpp::wrap(res);
}
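
To make the logic easier to follow (and to test outside R), here is a minimal sketch of the same loop in plain standard C++. It assumes std::vector<double> as a stand-in for NumericVector, and the name intersect_hash is hypothetical, used only for this illustration. Like the version above, it does not treat NA/NaN specially.

```cpp
#include <unordered_set>
#include <vector>

// Plain-C++ stand-in for myintersect (hypothetical name, illustration only):
// std::vector<double> replaces Rcpp's NumericVector.
std::vector<double> intersect_hash(const std::vector<double>& x,
                                   const std::vector<double>& y) {
    std::vector<double> res;
    // Hash set holding the distinct values of y.
    std::unordered_set<double> s(y.begin(), y.end());
    for (double v : x) {
        auto f = s.find(v);
        if (f != s.end()) {
            // v occurs in both inputs: keep it, in the order it appears in x.
            res.push_back(v);
            // Erase the match so a repeated value in x is reported only once.
            s.erase(f);
        }
    }
    return res;
}
```

The erase step is what removes duplicates: once a value has been matched it is no longer in the set, so later copies of it in x are skipped.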

We can load the function and verify it works:

library(Rcpp)
sourceCpp(file="myintersect.cpp")

set.seed(144)
x <- c(-1, -1, sample(seq(1000000), 10000, replace=TRUE))
y <- c(-1, sample(seq(1000000), 10000, replace=TRUE))
all.equal(intersect(x, y), myintersect(x, y))
# [1] TRUE

However, it seems this approach is a good deal less efficient than R's built-in intersect function (roughly 3.6 times slower at the median in the benchmark below):

library(microbenchmark)
microbenchmark(intersect(x, y), myintersect(x, y))
# Unit: microseconds
#               expr      min       lq   median        uq      max neval
#    intersect(x, y)  424.167  495.861  501.919  523.7835  989.997   100
#  myintersect(x, y) 1778.609 1798.111 1808.575 1835.1570 2571.426   100
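
One alternative that avoids hashing altogether is a sort-based intersection. The following is only a sketch in plain standard C++ (std::vector<double> in place of NumericVector; intersect_sorted is a hypothetical name), and it was not benchmarked here, so whether it actually beats either function above is an open question. Note two differences from R's intersect: the result comes back in ascending order rather than in order of first appearance in x, and the inputs are assumed to contain no NA/NaN values, since those would break the ordering std::sort relies on.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Sort-based intersection (hypothetical name, illustration only).
// Arguments are taken by value so the caller's data is left untouched.
std::vector<double> intersect_sorted(std::vector<double> x, std::vector<double> y) {
    std::sort(x.begin(), x.end());
    // Drop duplicates from x; that alone guarantees each common value
    // appears at most once in the output of std::set_intersection.
    x.erase(std::unique(x.begin(), x.end()), x.end());
    std::sort(y.begin(), y.end());

    std::vector<double> res;
    std::set_intersection(x.begin(), x.end(), y.begin(), y.end(),
                          std::back_inserter(res));
    return res;
}
```

To try it from R, the same body could be wrapped in an exported function that converts the NumericVector arguments to std::vector<double> and returns the result via Rcpp::wrap, as myintersect does.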