Can data.table handle identical column names when using .SDcols?

Question

When using .SD to apply a function to a subset of dt's columns I can't seem to find the correct way to handle the situation where I have duplicated column names... e.g.

#  Make some data
library(data.table)
set.seed(123)
dt <- data.table( matrix( sample(6, 16, replace = TRUE) , 4 ) )
setnames(dt , rep( letters[1:2] , 2 ) )
#   a b a b
#1: 2 6 4 5
#2: 5 1 3 4
#3: 3 4 6 1
#4: 6 6 3 6

#  Use .SDcols to multiply both 'a' columns, specifying them by numeric position
dt[ , lapply( .SD , `*`  , 2 ) , .SDcols = which( names(dt) %in% "a" ) ]
#    a  a
#1:  4  4
#2: 10 10
#3:  6  6
#4: 12 12

I couldn't get it to work when .SDcols was a character vector of column names, so I tried numeric positions instead (which( names(dt) %in% "a" ) gives the vector [1] 1 3). That also seems to multiply only the first a column and return it twice. Am I doing something wrong?

The documentation for .SDcols says: "Advanced. Specifies the columns of x included in .SD. May be character column names or numeric positions."

These also returned the same result as above...

dt[ , lapply( .SD ,function(x) x*2 ) , .SDcols = which( names(dt) %in% "a" ) ]
dt[ , lapply( .SD ,function(x) x*2 ) , .SDcols = c(1,3) ]

packageVersion("data.table")
#[1] ‘1.8.11’

See here for on-going discussion on this topic and here for the bug filed by @Ricardo — Arun
Is this just playing around or do you have a reason for duplicated names? If it's the latter, please contribute to the post Arun mentioned. — eddi

CHP · Accepted Answer · 2013-11-06T12:11:23

How about this?

dt[, "a"] * 2
##    a a.1
## 1  4   8
## 2 10   6
## 3  6  12
## 4 12   6
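If .SDcols misbehaves with duplicated names on an older version, one possible workaround is to skip .SD entirely and pull each column out by position with [[ ]]. The following is only a sketch under stated assumptions: it uses a small deterministic table (matrix(1:16, 4)) rather than the random data from the question, and the names idx and res are made up for illustration.

```r
library(data.table)

# Assumed toy data: same layout as the question (a, b, a, b),
# but deterministic values instead of sample()
dt <- data.table(matrix(1:16, 4))
setnames(dt, rep(letters[1:2], 2))

# Positions of every column called "a"
idx <- which(names(dt) == "a")

# Extract each column by position, so the duplicated names are never looked up
res <- as.data.table(lapply(idx, function(i) dt[[i]] * 2))

# Put the original (duplicated) names back on the result
setnames(res, names(dt)[idx])

print(res)
```

Because every column is addressed by its index, the result holds two different doubled columns rather than the first a column twice.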

For a more detailed discussion, see:

https://chat.stackoverflow.com/transcript/message/12783493#12783493

2 Answers

This now works as intended since 1.9.4. From NEWS: