2
votes

I want to subset a data frame based on multiple column name criteria. I have a data frame as below:

id  team_col_code1   team_col_code2 ... team_col_code78   Gender State team_cost_code1   team_cost_code2 ... team_cost_code43 

I am trying to subset this data frame so that the new dataset keeps only the columns whose names contain "col", plus the columns "id" and "Gender".

I am able to create a subset of the columns whose names contain the keyword "col", as shown below:

new_Df <- df[grep("col", names(df))]

I am not sure how to also include the other two columns, id and Gender, in this subset so that the new dataset looks like this:

id  team_col_code1   team_col_code2   ... team_col_code78   Gender

Any help is much appreciated. Thanks.

@ZheyuanLi toy dataset from UCLA online repository - Diggy Detroit
@ZheyuanLi fixed it :) - Diggy Detroit

2 Answers

3
votes

It can be as straightforward as

df[c("id", grep("col", names(df), value = TRUE), "Gender")]
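
A minimal sketch of the line above on a toy data frame. The column names and values here are made-up stand-ins for the ones in the question, and `keep` and `new_df` are just illustrative names.

```r
# Toy data frame; a shortened, hypothetical version of the one in the question.
df <- data.frame(
  id = 1:3,
  team_col_code1 = c(10, 20, 30),
  team_col_code2 = c(11, 21, 31),
  Gender = c("F", "M", "F"),
  State = c("MI", "OH", "MI"),
  team_cost_code1 = c(100, 200, 300)
)

# value = TRUE makes grep() return the matching names instead of their
# positions, so they can be combined with the literal names "id" and "Gender".
keep <- c("id", grep("col", names(df), value = TRUE), "Gender")
new_df <- df[keep]

names(new_df)
# "id" "team_col_code1" "team_col_code2" "Gender"
```

Because the columns are selected by name, the result follows the order given in `c(...)`, so id comes first and Gender last regardless of where they sit in df.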
0
votes

This should also work:

df[, grepl("col|id|Gender", colnames(df))]
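
One thing to watch with the pattern above: "id" and "Gender" are matched anywhere in a column name, so any other column that happens to contain "id" would be kept as well. The sketch below assumes a hypothetical extra column, paid_flag, to show the difference, and anchors the two exact names with ^ and $. The toy data and the names `loose` and `strict` are illustrative only.

```r
# Toy data frame; paid_flag is a hypothetical column whose name contains "id".
df <- data.frame(
  id = 1:3,
  team_col_code1 = c(10, 20, 30),
  Gender = c("F", "M", "F"),
  State = c("MI", "OH", "MI"),
  paid_flag = c(TRUE, FALSE, TRUE)
)

# Unanchored pattern: "id" matches inside "paid_flag" too.
loose <- grepl("col|id|Gender", colnames(df))

# Anchored pattern: "id" and "Gender" must be the whole name, "col" may
# appear anywhere in the name.
strict <- grepl("^id$|col|^Gender$", colnames(df))

names(df)[loose]
# "id" "team_col_code1" "Gender" "paid_flag"
names(df)[strict]
# "id" "team_col_code1" "Gender"

# drop = FALSE keeps the result a data frame even if only one column matches.
new_df <- df[, strict, drop = FALSE]
```

With a logical index the columns keep their original order in df, which for the data in the question is the same order that was asked for.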