I'm trying to simplify this data.table
two-stage process which acts on both numeric and character variables. E.g. - take the first element of textvar
and sum
each of the numeric variables. Consider this small example:
library(data.table)
dt <- data.table(grpvar=letters[c(1,1,2)], textvar=c("one","two","one"),
numvar=1:3, othernum=2:4)
dt
# grpvar textvar numvar othernum
#1: a one 1 2
#2: a two 2 3
#3: b one 3 4
Now my first thought was to nest .SD
to drop the one variable out of the lapply
call, but I thought that was a bit complicated:
dt[, c(textvar=textvar[1], .SD[, lapply(.SD, sum), .SDcols=-c("textvar")]), by=grpvar]
# grpvar textvar numvar othernum
#1: a one 3 5
#2: b one 3 4
Then I thought maybe I could do each grouping separately and join them, but that seems even worse:
dt[, .(textvar=textvar[1]), by=grpvar][
dt[, lapply(.SD, sum), by=grpvar, .SDcols=-c("textvar")], on="grpvar"
]
# grpvar textvar numvar othernum
#1: a one 3 5
#2: b one 3 4
Is there a simpler construction that would get around the nesting of .SD
or the joining? I feel like I'm overlooking something elementary.