I am using the wonderful R data.table package. However, accessing (i.e. manipulating by reference) a column whose name is stored in a variable is very clumsy. If we are given a data.table dt which has two columns x and y, and we want to add the two columns and name the result z, then the command is
dt = dt[, z := x + y]
Now let us write a function add that takes as arguments a (reference to a) data.table dt and three column names summand1Name, summand2Name and resultName, and that is supposed to execute the exact same command as above, only with general column names. The solution I am using right now is reflection, i.e.
add = function(dt, summand1Name, summand2Name, resultName) {
  cmd = paste0('dt = dt[, ', resultName, ' := ', summand1Name, ' + ', summand2Name, ']')
  eval(parse(text=cmd))
  return(dt) # optional since manipulated by reference
}
However, I am absolutely not satisfied with this solution. First of all, it is clumsy and no fun to code like this: it is hard to debug, frustrating, and it burns time. Secondly, it is harder to read and understand. Here is my question:
Can we write this function in a somewhat nicer way?
I am aware that one can access a column by a variable name like so: dt[[resultName]], but when I write
dt[[resultName]] = dt[[summand1Name]] + dt[[summand2Name]]
then data.table starts to complain about having taken copies and not working by reference. I don't want that. Also, I like the syntax dt = dt[<all 'database related operations'>] so that everything I am doing stays together in one pair of brackets. Isn't it possible to use a special symbol, such as backticks, to indicate that the name currently used does not refer to an actual column of the data table but is rather a placeholder for the name of an actual column?
get and mget – talat

add = function(dt, summand1Name, summand2Name, resultName) dt[, (resultName) := .SD[[summand1Name]] + .SD[[summand2Name]]] ? Another option could be add2 = function(dt, summand1Name, summand2Name, resultName) dt[, (resultName) := eval(as.name(summand1Name)) + eval(as.name(summand2Name))] or just use get as suggested above. – David Arenburg
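To make the suggestions from the comments concrete, here is a minimal sketch of two ways the function could be written without building a command string. The function names add_get and add_set are made up for this illustration. The first variant uses get() inside the brackets, as suggested above; the second uses data.table's set(), which also assigns by reference. One caveat with get(): if dt happens to contain a column that is itself called summand1Name or summand2Name, the lookup may resolve to that column rather than to the function argument, so treat this as a sketch and not a hardened implementation.

```r
library(data.table)

# Variant 1: get() looks up each summand column by the name stored in the
# argument. Wrapping resultName in parentheses tells := to use the value of
# the variable as the new column name instead of the literal "resultName".
add_get <- function(dt, summand1Name, summand2Name, resultName) {
  dt[, (resultName) := get(summand1Name) + get(summand2Name)]
  invisible(dt)  # dt was modified by reference; returned only for chaining
}

# Variant 2: set() assigns by reference as well and takes the target column
# name as a plain character string, so no non-standard evaluation is needed.
add_set <- function(dt, summand1Name, summand2Name, resultName) {
  set(dt, j = resultName, value = dt[[summand1Name]] + dt[[summand2Name]])
  invisible(dt)
}

dt <- data.table(x = 1:3, y = 4:6)
add_get(dt, "x", "y", "z")   # adds column z to dt in place
add_set(dt, "x", "y", "z2")  # adds column z2 to dt in place
print(dt)
```

Both calls modify dt in place, so no reassignment (dt = ...) is required. The get() variant keeps everything inside one pair of brackets, which is the style asked for in the question; the set() variant gives that up in exchange for taking plain character names.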