2
votes

I am trying to use grep to subset the columns of a one-row data frame. When grep matches multiple columns, the new data frame keeps the corresponding column names. When only one column matches, names() on the result returns NULL. I am using this method because I am looping over many sites that may contain different combinations of HVAC sensor data.

I am trying to create a subset for each unit ('HVAC1', 'HVAC2', 'HVAC3') and a subset for the columns that are common to all units. In this case, only one column is common to all units: 'IAT', the indoor ambient temperature. Also, there is no third HVAC unit, so the grep on HVAC3 rightly returns names(sensordata.h3) as character(0).

Here is my code.

sensordata <- data.frame(sitetime = c("2015-10-22 14:15:17"), HVAC1RT = c(70.7), HVAC1ST = c(74.75), HVAC2RT = c(66.875), HVAC2ST = c(46.4), IAT = c(72.5))
sensordata
names(sensordata)

sensordata.h1 <- sensordata[,c(grep("HVAC1",names(sensordata)))]
sensordata.h1
names(sensordata.h1)

sensordata.h2 <- sensordata[,c(grep("HVAC2",names(sensordata)))]
sensordata.h2
names(sensordata.h2)

sensordata.h3 <- sensordata[,c(grep("HVAC3",names(sensordata)))]
sensordata.h3
names(sensordata.h3)

sensordata.common <- sensordata[,c(grep("IAT|OAT|IAH",names(sensordata)))]
sensordata.common
names(sensordata.common)
1
Could you highlight your coding problem, i.e. the actual problem you face? - coffeinjunky
Indeed, you might be missing the question. - Parfait
When only one of the columns is returned by the grep as in the creation of the subset called sensordata.common, the call to names(sensordata.common) returns NULL - user3794794
Try drop=F as an option to the subsetting. - coffeinjunky
Thanks! That worked! - user3794794

1 Answer

1
votes

Try this:

sensordata.common <- sensordata[, grep("IAT|OAT|IAH", names(sensordata)), drop = FALSE]
sensordata.common
   IAT
1 72.5
names(sensordata.common)
[1] "IAT"

Setting drop to FALSE prevents [ from reducing a single-column result to a vector. See ?"[" (the quotes or backticks around [ are required). Writing FALSE in full is safer than F, because F is an ordinary variable that can be reassigned.
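Since the question loops over several units, the same subsetting can be wrapped in a small function. This is only a sketch: subset_cols is a hypothetical helper name, not something from base R, and the data frame is the one-row sample from the question.

```r
# One-row sample data frame from the question
sensordata <- data.frame(sitetime = "2015-10-22 14:15:17",
                         HVAC1RT = 70.7, HVAC1ST = 74.75,
                         HVAC2RT = 66.875, HVAC2ST = 46.4, IAT = 72.5)

# Hypothetical helper: keep the columns whose names match a regular expression.
# drop = FALSE keeps the result a data frame even when only one column matches.
subset_cols <- function(df, pattern) {
  df[, grep(pattern, names(df)), drop = FALSE]
}

# Without drop = FALSE, a single match collapses to a plain vector
dropped <- sensordata[, grep("IAT|OAT|IAH", names(sensordata))]

kept <- subset_cols(sensordata, "IAT|OAT|IAH")  # one matching column
none <- subset_cols(sensordata, "HVAC3")        # no matching columns

is.data.frame(dropped)  # FALSE
names(dropped)          # NULL
names(kept)             # "IAT"
names(none)             # character(0)
```

With this, every subset is a data frame regardless of how many columns match, so names() behaves the same way for each site in the loop.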

Alternatively, you could use dplyr::select, as in select(sensordata, contains("your_names_here")); for a regular expression such as "IAT|OAT|IAH", use matches() instead of contains(). select always returns a data frame, even when only one column matches.
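A minimal sketch of the dplyr route, assuming dplyr is installed and reusing the sample data from the question. matches() ignores case by default, so ignore.case = FALSE is passed here to mirror the behaviour of grep.

```r
library(dplyr)  # assumption: dplyr is installed

# One-row sample data frame from the question
sensordata <- data.frame(sitetime = "2015-10-22 14:15:17",
                         HVAC1RT = 70.7, HVAC1ST = 74.75,
                         HVAC2RT = 66.875, HVAC2ST = 46.4, IAT = 72.5)

# Regular-expression match on column names; the result stays a data frame
common <- select(sensordata, matches("IAT|OAT|IAH", ignore.case = FALSE))

# Prefix match for one unit
h1 <- select(sensordata, starts_with("HVAC1"))

names(common)  # "IAT"
names(h1)      # "HVAC1RT" "HVAC1ST"
```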