7
votes

How can I subset a data.table by using a variable, when the variable name is identical to an existing column name in the data.table? It works with get("varname", pos = 1), but is there a more robust/flexible solution?

library(data.table)

my_data_frame <- data.frame(
"V1"=c("A","B","C","A"),
"V2"=c(1, 2, 3, 4),
stringsAsFactors = FALSE        
)

V1 <- "A"

my_data_table <- as.data.table(my_data_frame)

# Can I improve this a bit? I want rows where V1 == "A", but use V1 in the statement 
my_data_table[ my_data_table$V1 == get("V1", pos = 1), ]

Renaming V1 is not an option.

UPDATE: I do not consider this a 100% duplicate. The accepted answer to that question is not acceptable for my question, since it uses an explicit get, which I do not want to use, as stated in the comments.

3
Perhaps a bit unorthodox to do row subsetting in j, but then we can use the 'dot dot notation': d[ , d[V1 == ..V1]] - Henrik
Another option is to specify the environment: my_data_table[V1 == get("V1", envir = .GlobalEnv)] - Jaap
It works here (I just used the shorter "d" as name of the data set). Do you have data.table version >= v1.10.2? - Henrik
another alternative similar to Henrik: d[d[, .I[V1 == ..V1]]] - chinsoon12

3 Answers

3
votes

If you don't mind doing it in two steps, you can build the logical index outside the scope of your data.table, where V1 can only refer to the variable, and then subset with it (though it's usually not what you want to do when working with ...):

wh_v1 <- my_data_table[, V1]==V1
my_data_table[wh_v1]
#   V1 V2
#1:  A  1
#2:  A  4
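
A related way to keep the comparison inside the data.table is to pass the variable through a function argument with a different name. This is only a sketch: the helper name subset_by_v1 and the argument name value are hypothetical, and it assumes the table has no column called value.

```r
library(data.table)

my_data_table <- data.table(V1 = c("A", "B", "C", "A"), V2 = c(1, 2, 3, 4))
V1 <- "A"

# Hypothetical helper: `value` is evaluated in the caller's environment,
# where V1 is the variable, not the column.
subset_by_v1 <- function(dt, value) {
  force(value)      # resolve the argument before entering the data.table scope
  dt[V1 == value]   # here V1 is the column, `value` is the function argument
}

result <- subset_by_v1(my_data_table, V1)
```

The global variable keeps its name; only the argument inside the helper is named differently, so there is nothing for the column to shadow.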
3
votes

Here is a solution using library(tidyverse):

library(data.table)
library(tidyverse)
my_data_frame <- data.frame(
  "V1"=c("A","B","C","A"),
  "V2"=c(1, 2, 3, 4),
  stringsAsFactors = FALSE        
)

V1 = "A"
my_data_table <- as.data.table(my_data_frame)
df = my_data_table %>% filter(V1 == !!get("V1")) #you do not have to specify pos = 1
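
If the goal is to avoid get altogether, dplyr's data masking also provides the .env pronoun, which refers to the calling environment instead of the columns. A minimal sketch, assuming a dplyr version that supports the pronoun (1.0 or later should):

```r
library(data.table)
library(dplyr)

my_data_table <- data.table(V1 = c("A", "B", "C", "A"), V2 = c(1, 2, 3, 4))
V1 <- "A"

# On the left, V1 is the column; .env$V1 is the variable from the calling environment.
df <- my_data_table %>% filter(V1 == .env$V1)
```
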

If you want R to use the object named "V1" as a lookup key instead, you can do this:

V1 = "A"
list_test = split(my_data_table, as.factor(my_data_table$V1)) #create a list for each factor level of the column V1.
df = list_test[[V1]] #extract the desired dataframe from the list using the object "V1"

Is that what you want?

1
votes

For equality conditions, you can use a join:

mDT = data.table(V1)
my_data_table[mDT, on=.(V1), nomatch=0]
#    V1 V2
# 1:  A  1
# 2:  A  4

Implicitly, the join condition in x[i, on=.(V1)] is

V1 == V1

where the LHS comes from x and the RHS from i. It works like a lookup of each row of i in x. By default, a value found in i but not in x yields a row of NAs; nomatch=0 drops such rows from the output. For example:

mDT2 = data.table(V1 = c("A", "D"))
my_data_table[mDT2, on=.(V1)]
#    V1 V2
# 1:  A  1
# 2:  A  4
# 3:  D NA

my_data_table[mDT2, on=.(V1), nomatch=0]
#    V1 V2
# 1:  A  1
# 2:  A  4
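
To make the two sides of the join condition explicit, the lookup column can be given a different name and mapped in on. This is a sketch only: the column name key_val is made up for illustration.

```r
library(data.table)

my_data_table <- data.table(V1 = c("A", "B", "C", "A"), V2 = c(1, 2, 3, 4))
V1 <- "A"

# key_val is a hypothetical column name; V1 here is the global variable.
lookup <- data.table(key_val = V1)

# In on = .(V1 = key_val), the LHS names the column of x and the RHS the column of i.
res <- my_data_table[lookup, on = .(V1 = key_val), nomatch = 0]
```
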