I am trying to write some wrapper functions to reduce code duplication with data.table.
Here is an example using mtcars. First, set up some data:
library(data.table)
data(mtcars)
mtcars$car <- factor(gsub("(.*?) .*", "\\1", rownames(mtcars)), ordered=TRUE)
mtcars <- data.table(mtcars)
Now, here is what I would usually write to get a summary of counts by group. In this case I am grouping by car:
mtcars[, list(Total=length(mpg)), by="car"][order(car)]
car Total
AMC 1
Cadillac 1
Camaro 1
...
Toyota 2
Valiant 1
Volvo 1
The complication is that, since the arguments i and j are evaluated in the frame of the data.table, you have to use eval(...) if you want to pass in variables.
This works:
group <- "car"
mtcars[, list(Total=length(mpg)), by=eval(group)]
But now I want to order the results by the same grouping variable. I can't get any variant of the following to give me correct results. Notice how I always get a single row of results, rather than the ordered set.
mtcars[, list(Total=length(mpg)), by=eval(group)][order(group)]
car Total
Mazda 2
I know why: it's because group is evaluated in the parent.frame, not the frame of the data.table.
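The single row can be reproduced outside data.table. As a minimal illustration, assuming group holds the string "car" as above, order() only ever sees a length-one character vector, so the resulting index selects just the first row:

```r
# group is a plain string here, not a column of the table
group <- "car"

# order() of a length-one vector can only return the index 1
idx <- order(group)
print(idx)
```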
How can I evaluate group in the context of the data.table?
More generally, how can I use this inside a function? I need the following function to give me all the results, not just the first row of data:
tableOrder <- function(x, group){
  x[, list(Total=length(mpg)), by=eval(group)][order(group)]
}
tableOrder(mtcars, "car")
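For reference, one possible way around this is sketched below. It is an assumption about what might fit here rather than a confirmed solution: it keeps the by=eval(group) call from above and replaces the chained [order(group)] with data.table's setorderv(), which takes column names as strings. The names tableOrderSketch, dt and res are hypothetical.

```r
library(data.table)

# Same setup as in the question, on a separate copy named dt (hypothetical name)
data(mtcars)
dt <- data.table(mtcars, car = factor(gsub("(.*?) .*", "\\1", rownames(mtcars)), ordered = TRUE))

# Hypothetical variant of tableOrder(): sort by column name instead of order(group)
tableOrderSketch <- function(x, group) {
  res <- x[, list(Total = length(mpg)), by = eval(group)]
  setorderv(res, group)  # reorders res by reference using the column named in group
  res[]                  # return the table so that it prints when called at top level
}

res <- tableOrderSketch(dt, "car")
print(res)
```

Another option that may work is order(get(group)) in the chained call, since get() looks the column up by name; setorderv() is used in the sketch because it takes the string directly.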