I’m trying to remove the rows of one data frame (df1) whose values in three columns match a row of another data frame (df2), which consists of those same three columns. For example:

df1 <- data.frame(id = c(1552, 1552, 2501, 2504, 2504, 2504),
                  month = c(4, 6, 7, 3, 4, 4),
                  year = c(1970, 1970, 1971, 1971, 1971, 1972),
                  weight = c(135, 654, 164, 83, 155, 195),
                  sex = c('F', 'F', 'M', 'F', 'F', 'F'))

df2 <- data.frame(id = c(1552, 2504), month = c(6, 4), year = c(1970, 1971))

In the end I would like this:

    id month year weight sex
1 1552     4 1970    135   F
2 2501     7 1971    164   M
3 2504     3 1971     83   F
4 2504     4 1972    195   F

This question seems similar: "Subset a data frame based on another", but I have been unable to apply the suggested solution to my problem. Does anyone know how to do this?

1 Answer

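I think dplyr::anti_join will be helpful here. It returns the rows of df1 that have no match in df2, joining on the columns the two data frames share (id, month, and year).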

library(dplyr)
df1 <- data.frame(id = c(1552, 1552, 2501, 2504, 2504, 2504),
                  month = c(4, 6, 7, 3, 4, 4),
                  year = c(1970, 1970, 1971, 1971, 1971, 1972),
                  weight = c(135, 654, 164, 83, 155, 195),
                  sex = c('F', 'F', 'M', 'F', 'F', 'F'))
df2 <- data.frame(id = c(1552, 2504), month = c(6, 4), year = c(1970, 1971))
df1 %>% anti_join(df2)
## Joining by: c("id", "month", "year")
##     id month year weight sex
## 1 2504     4 1972    195   F
## 2 2504     3 1971     83   F
## 3 2501     7 1971    164   M
## 4 1552     4 1970    135   F
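
If adding dplyr is not an option, the same anti-join can be sketched in base R. This is only one possible approach, not the method from the answer above: it assumes the three key columns are named id, month, and year in both data frames, and the names key_cols, key1, key2, and result are made up for illustration.

```r
df1 <- data.frame(id = c(1552, 1552, 2501, 2504, 2504, 2504),
                  month = c(4, 6, 7, 3, 4, 4),
                  year = c(1970, 1970, 1971, 1971, 1971, 1972),
                  weight = c(135, 654, 164, 83, 155, 195),
                  sex = c('F', 'F', 'M', 'F', 'F', 'F'))
df2 <- data.frame(id = c(1552, 2504), month = c(6, 4), year = c(1970, 1971))

# Columns that must all match for a row to be dropped (assumed names)
key_cols <- c("id", "month", "year")

# Collapse the key columns of each row into a single string, e.g. "1552_6_1970"
key1 <- do.call(paste, c(df1[key_cols], sep = "_"))
key2 <- do.call(paste, c(df2[key_cols], sep = "_"))

# Keep only the rows of df1 whose key does not appear in df2
result <- df1[!(key1 %in% key2), ]
rownames(result) <- NULL  # renumber rows 1..n
result
```

This keeps the rows in their original df1 order. With dplyr, passing the columns explicitly, as in anti_join(df1, df2, by = c("id", "month", "year")), avoids the "Joining by" message and guards against accidental matches on other shared column names.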