Create column names based on "by" argument the data.table way

Question

Say I have the following data.table

dt <- data.table(var = c("a", "b"), val = c(1, 2))

Now I want to add two new columns to dt, named a, and b with the respective values (1, 2). I can do this with a loop, but I want to do it the data.table way.

The result would be a data.table like this:

dt.res <- data.table(var = c("a", "b"), val = c(1, 2), #old vars 
                     a = c(1, NA), b = c(NA, 2)) # newly created vars

So far I came up with something like this

dt[, c(xx) := val, by = var]

where xx would be a data.table-command similar to .N which addresses the value of the by-group.

Thanks for the help!

Appendix: The for-loop way

The non-data.table-way with a for-loop instead of a by-argument would look something like this:

for (varname in dt$var){
     dt[var == varname, c(varname) := val]
}

Try dt[, c('a', 'b') := Map('*', lapply(var, function(x) val[NA^(x!=var)]), val)] — akrun
The bigger picture would be nice to see. What are you really trying to do? — Matt Dowle
I am trying to write a function that takes dates of financial transaction constructs a portfolio out of this information. The question concerns a part of the function where the number of stocks held in the portfolio is calculated. For example, stock "a" is held with a quantity of 1 and stock "b" with a quantity of 2. The next steps are merging the data.table with another one, containing the prices and multiplying the values to get the portfolio value. Does that help in any way? — David
What's wrong with dt[, ':='(var1=val1, var2=val2...) ] ? Do you need to generate or pass the names programmatically? — smci
I don't understand "xx would be a data.table-command similar to .N which addresses the value of the by-group". Can you give an example? Do you want to form the variable name vector in xx based on the names in the by-argument? Please give an example. — smci

akrun akrun · Accepted Answer · 2015-10-12T08:11:40

Based on the example showed, we can use dcast from the data.table to convert the long format to wide, and join with the original dataset on the 'val' column.

library(data.table)#v1.9.6+
dt[dcast(dt, val~var, value.var='val'), on='val']
#   var val  a  b
#1:   a   1  1 NA
#2:   b   2 NA  2

Or as @CathG mentioned in the comments, for previous versions either merge or set the key column and then join.

merge(dt, dcast.data.table(dt, val~var, value.var='val'))

Create column names based on "by" argument the data.table way

1 Answers