
I know how to select variables from a large data.frame whose column names contain one given string, as in: How do I select variables in an R dataframe whose names contain a particular string?

But how do I select the columns whose names contain either one string or another?

I'd prefer not to have to split and recombine the df, so that the columns would be kept in their original order.

Here is my sample code, using grep, for obtaining variables matching the first string only, which works well:

df[grep("top",names(df),fixed=TRUE)]

grep won't take logical operators. So how do I select the second set of columns with "base" in the column name?

df[grep("top|base", names(df))]? - talat
"grep won't take logical operators" -> actually it does; alternatively, use str_detect from the stringr package, like so: df[str_detect(names(df), "top|base")] - grrgrrbla
I had tried this, but it returns a df object with all of the rows and no variables in it. I don't think grep likes logical operators like |. - jmk
Oh, I've figured it out: using the argument fixed=TRUE means that the operator won't work :) - jmk
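
The point reached in the comments can be checked with a small sketch. The data frame and its column names (top_a, mid_b, base_c, top_d) are made up for illustration and are not from the question; only grep and grepl from base R are used.

```r
# Hypothetical data frame; the column names are invented for illustration.
df <- data.frame(top_a = 1:3, mid_b = 4:6, base_c = 7:9, top_d = 10:12)

# With fixed = TRUE the pattern is treated as a literal string, so "|" is
# not an alternation operator and no column name matches.
fixed_hits <- grep("top|base", names(df), fixed = TRUE)

# Without fixed = TRUE the pattern is a regular expression and "|" means "or".
regex_hits <- grep("top|base", names(df))

# The matching columns come back in their original order,
# with no need to split and recombine the data frame.
both <- df[regex_hits]

# grepl expresses the same selection as a logical vector.
both_logical <- df[grepl("top|base", names(df))]
```

Here fixed_hits is empty, which would explain a result with all rows and no variables, while both keeps top_a, base_c and top_d in that order.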

1 Answer


This selects the second column whose name contains "base":

df[grep("base",colnames(df))[2]]

or, more explicitly, using drop=FALSE so that the result is always a data frame:

df[,grep("base",colnames(df))[2],drop=FALSE]

In both cases, the [2] at the end of the grep call picks only the second column of df whose name contains the string "base". Leave the [2] out to keep every column that matches.
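
As a rough illustration of what the [2] does, here is a sketch on a made-up data frame (the column names top_x, base_1, other and base_2 are assumptions, not taken from the question):

```r
# Hypothetical data frame with two "base" columns; names are invented.
df <- data.frame(top_x = 1:2, base_1 = 3:4, other = 5:6, base_2 = 7:8)

# Positions of all columns whose names contain "base".
base_idx <- grep("base", colnames(df))

# Only the second matching column, kept as a one-column data frame.
second_base <- df[, base_idx[2], drop = FALSE]

# Every matching column, in original order.
all_base <- df[, base_idx, drop = FALSE]

# Without drop = FALSE, df[, j] with a single column returns a plain vector.
as_vector <- df[, base_idx[2]]
```

With this input, second_base holds only base_2, all_base holds base_1 and base_2, and as_vector is no longer a data frame.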