In R, how do I delete rows in a data frame by column names of another data frame?

Question

I have one dataframe (df1) with more than 200 columns containing data (several thousands of rows each). Column names are alphanumeric and all distinct from each other.

I have a second dataset (df2) with a couple of columns, where the values in the first column (named 'col1') are column names of df1.

However, not every row in df2 has a corresponding column in df1.

Now I would like to delete (drop) all rows in df2 where there is no "corresponding" column in df1.

I searched quite a while using keywords like "subset data.frame by values from another data.frame" but did not find any solution. I checked, e.g. here, here or here and some other places.

Thanks for your help.

Can you create a small reproducible example? See tips here - use built-in data, or simulate data, or use dput() to share reproducibly. — Gregor Thomas
But maybe what you want is df2[df2$col1 %in% names(df1), ]. It doesn't seem to matter at all that df1 is a data frame. The only thing that matters is that you have a character vector of values you want to keep, and that happens to be names(df1). — Gregor Thomas

effel · Accepted Answer · 2016-06-08T20:54:44

Data:

df1 <- data.frame(a = 1:3, b = 1:3)
#   a b
# 1 1 1
# 2 2 2
# 3 3 3

df2 <- data.frame(col1 = c("a", "c"))
#   col1
# 1    a
# 2    c

Keep the rows in df2 whose col1 values are column names in df1:

subset(df2, col1 %in% names(df1))
#   col1
# 1    a
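
The same filter can also be written with bracket indexing, as suggested in the comments. The following is a minimal sketch using the toy data above; the names keep, kept and dropped are only illustrative. One caveat: when df2 has a single column, as in this example, plain df2[rows, ] returns a vector rather than a data frame, so drop = FALSE is added here. With several columns in df2, as described in the question, that argument is not needed.

```r
# Same toy data as above; stringsAsFactors = FALSE keeps col1 as character
# in R versions before 4.0 as well.
df1 <- data.frame(a = 1:3, b = 1:3)
df2 <- data.frame(col1 = c("a", "c"), stringsAsFactors = FALSE)

# TRUE where the value in col1 is a column name of df1.
keep <- df2$col1 %in% names(df1)

# drop = FALSE keeps the result a data frame even with one column.
kept <- df2[keep, , drop = FALSE]
#   col1
# 1    a

# Negating the condition shows the rows that were removed.
dropped <- df2[!keep, , drop = FALSE]
#   col1
# 2    c
```

Inspecting dropped before discarding it is a quick way to check for mismatches caused by typos or differing case, since %in% compares the strings exactly.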
