Comparing two data frames and filtering rows based on their values in R

Question

I want to compare two data frames in R that have the same column names (df1 and df2). Based on the values in each column of one of them (df2), I want to filter the other (df1). I need to eliminate the rows of df1 in which any value is greater than or equal to the value in df2 for the same column name. In other words, I need to produce res1 below:

df1 <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(2, 10, 5, 11), v3 = c(20, 25, 23, 2), v4 = c(1, 2, 1, 3))

> df1
  v1 v2 v3 v4
1  1  2 20  1
2  2 10 25  2
3  3  5 23  1
4  4 11  2  3

df2 <- data.frame(v1 = 4, v2 = 10, v3 = 30, v4 = 3)

> df2
  v1 v2 v3 v4
1  4 10 30  3

So the desired output res1 is generated by comparing each row of df1 with df2, column by column, and eliminating every row of df1 in which any value is greater than or equal to the threshold defined for that column in df2:

> res1
  v1 v2 v3 v4
1  1  2 20  1
2  3  5 23  1

Will df2 always be a one-row data frame? What if it has multiple rows? Should we compare against each row? — Ronak Shah
@RonakShah It is always a one-row data frame. In df2 I define the threshold values for deleting rows in df1. — Makaroni

Sotos · Accepted Answer · 2017-01-26T09:05:38

We can use mapply with the < operator to compare the two data frames column by column, then use rowSums to build the index for subsetting, i.e.

df1[rowSums(mapply(`<`, df1, df2)) == ncol(df1),]
#  v1 v2 v3 v4
#1  1  2 20  1
#3  3  5 23  1
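
To make the one-liner easier to follow, here is a sketch that splits it into its intermediate steps, using the df1 and df2 from the question. The names below, n_below and keep are only illustrative and do not appear in the original answer.

```r
df1 <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(2, 10, 5, 11),
                  v3 = c(20, 25, 23, 2), v4 = c(1, 2, 1, 3))
df2 <- data.frame(v1 = 4, v2 = 10, v3 = 30, v4 = 3)

# Compare each column of df1 with the matching threshold in df2.
# mapply pairs the columns by position and returns a logical matrix
# with one row per row of df1 and one column per variable.
below <- mapply(`<`, df1, df2)

# For every row, count the columns that are strictly below the threshold.
n_below <- rowSums(below)

# A row survives only if it is below the threshold in all columns.
keep <- n_below == ncol(df1)
res1 <- df1[keep, ]
res1
```

Subsetting keeps the original row names (1 and 3 here). If the 1, 2 numbering shown in the question is wanted, rownames(res1) <- NULL resets them.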

Additionally, a fully vectorized version of the above (courtesy of @RonakShah) is:

df1[rowSums(df1 < df2[rep(1, nrow(df1)), ]) == ncol(df1), ]
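
The rep(1, nrow(df1)) part is there because < on two data frames is only defined when both have the same dimensions, so the single threshold row is repeated once per row of df1. The sketch below spells that out and checks that both approaches agree on the example data. The names thresholds, res_vec and res_map are illustrative, and selecting columns with names(df1) is an extra guard I am assuming is useful when df2 might list its columns in a different order, since the comparison itself pairs columns by position.

```r
df1 <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(2, 10, 5, 11),
                  v3 = c(20, 25, 23, 2), v4 = c(1, 2, 1, 3))
df2 <- data.frame(v1 = 4, v2 = 10, v3 = 30, v4 = 3)

# Repeat the single threshold row so it has as many rows as df1, and
# pick the columns in df1's order (assumption: names match in both).
thresholds <- df2[rep(1, nrow(df1)), names(df1)]

# Element-wise comparison of two equally sized data frames gives a
# logical matrix; keep rows that are below the threshold everywhere.
res_vec <- df1[rowSums(df1 < thresholds) == ncol(df1), ]

# The mapply version from above, for comparison.
res_map <- df1[rowSums(mapply(`<`, df1, df2)) == ncol(df1), ]

# Both approaches should select the same rows.
stopifnot(isTRUE(all.equal(res_vec, res_map)))
```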
