R: Subset from two data frames based on multiple conditions

Question

I have two data frames, df1 and df2. I want a new data frame, df3, containing every row of df1 whose "date" AND "time_of_day" both match a row of df2. I also want to save the rows of df1 that don't match in another data frame, df4.

I tried dplyr's filter function, but I don't think I am writing it correctly: the result has the same number of rows as df1, when it should contain only the rows that match on both date and time_of_day.

> df1
          date time_of_day     
1  2018-06-03     morning 
2  2018-06-06     afternoon 
4  2018-06-09     morning 
5  2018-06-10     afternoon 

> df2
          date time_of_day     
1  2018-06-03     morning 
2  2018-06-06     morning 
3  2018-06-08     morning 
4  2018-06-09     morning 
5  2018-06-10     afternoon
6  2018-06-11     afternoon

#creating a new data frame
df3 <- filter(df1, date %in% df2$date & time_of_day %in% df2$time_of_day)
#another try 
df3 <- df1[df1$date %in% df2$date & df1$time_of_day %in% df2$time_of_day,]

This is what I want:

> df3
          date time_of_day     
1  2018-06-03     morning 
2  2018-06-09     morning 
3  2018-06-10     afternoon 

> df4
          date time_of_day     
1  2018-06-06     afternoon

akrun · Accepted Answer · 2019-06-06T15:40:18

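We can do this with inner_join, which keeps the rows of df1 that have a match in df2 on every join column

library(dplyr)
# join on both columns explicitly rather than relying on the default of all shared columns
df3 <- inner_join(df1, df2, by = c("date", "time_of_day"))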
df3
#       date time_of_day
#1 2018-06-03     morning
#2 2018-06-09     morning
#3 2018-06-10   afternoon
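
One caveat worth sketching: inner_join repeats a df1 row once for every matching row in df2, so duplicated keys in df2 would inflate df3. dplyr's semi_join is a filtering join that returns each matching df1 row at most once. The data frame df2_dup below is hypothetical (the question's df2 has no duplicates) and exists only to show the difference.

```r
library(dplyr)

# df1 as in the question
df1 <- data.frame(
  date = c("2018-06-03", "2018-06-06", "2018-06-09", "2018-06-10"),
  time_of_day = c("morning", "afternoon", "morning", "afternoon"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table with one key pair repeated (not from the question)
df2_dup <- data.frame(
  date = c("2018-06-03", "2018-06-03", "2018-06-09", "2018-06-10"),
  time_of_day = c("morning", "morning", "morning", "afternoon"),
  stringsAsFactors = FALSE
)

keys <- c("date", "time_of_day")

# inner_join: the 2018-06-03 morning row of df1 appears twice
joined <- inner_join(df1, df2_dup, by = keys)
nrow(joined)    # 4

# semi_join: df1 rows with at least one match, each kept once
filtered <- semi_join(df1, df2_dup, by = keys)
nrow(filtered)  # 3
```

With the question's data both calls return the same three rows, so the choice only matters when df2 can contain repeated date/time_of_day pairs.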

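The rows of df1 with no match in df2 come from anti_join

# same keys; returns the df1 rows that have no partner in df2
df4 <- anti_join(df1, df2, by = c("date", "time_of_day"))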
df4
#       date time_of_day
#1 2018-06-06   afternoon
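
As for why the attempts in the question returned every row: each %in% test checks one column on its own, so 2018-06-06 / afternoon passes because that date occurs somewhere in df2$date and "afternoon" occurs somewhere in df2$time_of_day, even though no single row of df2 holds that pair. The following is a minimal base-R sketch of a pairwise test, assuming a pasted key with "|" as separator (an arbitrary choice that must not occur in the data); it is an alternative illustration, not part of the answer above.

```r
# Data copied from the question
df1 <- data.frame(
  date = c("2018-06-03", "2018-06-06", "2018-06-09", "2018-06-10"),
  time_of_day = c("morning", "afternoon", "morning", "afternoon"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("2018-06-03", "2018-06-06", "2018-06-08",
           "2018-06-09", "2018-06-10", "2018-06-11"),
  time_of_day = c("morning", "morning", "morning",
                  "morning", "afternoon", "afternoon"),
  stringsAsFactors = FALSE
)

# Column-by-column test from the question: TRUE for all four rows
independent <- df1$date %in% df2$date & df1$time_of_day %in% df2$time_of_day

# Pairwise test: combine both columns into one key per row, then compare keys
key1 <- paste(df1$date, df1$time_of_day, sep = "|")
key2 <- paste(df2$date, df2$time_of_day, sep = "|")
paired <- key1 %in% key2   # TRUE FALSE TRUE TRUE

df3 <- df1[paired, ]    # matching rows
df4 <- df1[!paired, ]   # non-matching rows
```

Note that df3 and df4 built this way keep df1's original row names, unlike the join results, which are renumbered.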

data

df1 <- structure(list(date = c("2018-06-03", "2018-06-06", "2018-06-09", 
"2018-06-10"), time_of_day = c("morning", "afternoon", "morning", 
"afternoon")), class = "data.frame", row.names = c("1", "2", 
"4", "5"))

df2 <- structure(list(date = c("2018-06-03", "2018-06-06", "2018-06-08", 
"2018-06-09", "2018-06-10", "2018-06-11"), time_of_day = c("morning", 
"morning", "morning", "morning", "afternoon", "afternoon")),
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))
