Extract data frame columns based on multiple criteria on column names

Question

I want to subset a data frame based on multiple column name criteria. I have a data frame as below:

id  team_col_code1   team_col_code2 ... team_col_code78   Gender State team_cost_code1   team_cost_code2 ... team_cost_code43

I am trying to subset this data frame so that the new dataset contains only the columns whose names contain "col", "id", or "Gender".

I am able to create a subset based on column names containing the keyword col, as shown below:

new_Df <- df[grep("col", names(df))]

I am not sure how to include the other two columns, id and Gender, in this subset so that the new dataset looks like this:

id  team_col_code1   team_col_code2   ... team_col_code78   Gender

Any help is much appreciated. Thanks.

Zheyuan Li · Accepted Answer · 3 votes · 2016-11-24T17:42:17

It can be as straightforward as

df[c("id", grep("col", names(df), value = TRUE), "Gender")]
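This works because grep(..., value = TRUE) returns the matching column names rather than their positions, so they can be combined with the literal names "id" and "Gender" in a single character vector. Below is a minimal sketch on a small made-up data frame (the values and the reduced set of columns are assumptions for illustration, not the asker's data). It also shows a possible alternative that is not part of the original answer: a single anchored regular expression, which keeps the columns in their original order instead of the order you list them in.

```r
# Toy data frame standing in for the one in the question
# (values and the reduced number of columns are illustrative only).
df <- data.frame(
  id = 1:3,
  team_col_code1 = c(10, 20, 30),
  team_col_code2 = c(11, 21, 31),
  Gender = c("F", "M", "F"),
  State = c("NY", "CA", "TX"),
  team_cost_code1 = c(5, 6, 7)
)

# Approach from the answer: literal names plus the names matched by grep().
by_name <- df[c("id", grep("col", names(df), value = TRUE), "Gender")]

# Alternative sketch: one regular expression with alternation.
# ^id$ and ^Gender$ are anchored so they only match those exact names.
by_regex <- df[grepl("^id$|col|^Gender$", names(df))]

print(names(by_name))
print(names(by_regex))
```

With the first form you control the column order yourself. With the second, the columns come out in whatever order they have in df, which happens to be the same here.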
