When I create a string column using data.table and pass the data.frame parameter stringsAsFactors = F, the resulting data.table correctly keeps the strings as character, but it also adds an extra column named "stringsAsFactors". It is easy enough to get rid of the extra column. But is there a way to tell data.table not to add columns based on the data.frame parameter? I.e., is this a bug or a feature? See the toy example below:
library(data.table)
factorTest <- sample(c('O','A', 'B','AB'), 50, replace = T)
summary(factorTest)
   Length     Class      Mode
       50 character character
summary(as.factor(factorTest))
 A AB  B  O
10 18  7 15
test1 <- data.frame(dabo = factor(factorTest,
                    levels = c('O','A','B','AB')), dabostr = factorTest,
                    stringsAsFactors = F)
test2 <- data.table(dabo = factor(factorTest,
                    levels = c('O','A','B','AB')), dabostr = factorTest,
                    stringsAsFactors = F)
summary(test1)
 dabo      dabostr
 O :15   Length:50
 A :10   Class :character
 B : 7   Mode  :character
 AB:18
summary(test2)
 dabo      dabostr          stringsAsFactors
 O :15   Length:50          Mode :logical
 A :10   Class :character   FALSE:50
 B : 7   Mode  :character   NA's :0
 AB:18
data.table simply doesn't have the stringsAsFactors argument - see ?data.table. So you are basically just creating a new column. The reason the strings aren't converted to factors, as they are by data.frame, is that keeping them as character is the default data.table behavior. - David Arenburg
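Following the comment above, a minimal sketch of two ways around the extra column. It assumes a data.table version that behaves as described in the comment (strings stay character by default, and unrecognised named arguments become columns). The names test3 and the set.seed() call are hypothetical additions for illustration, and test2 is reduced here to a single string column. The `%in%` guard is there so the snippet also runs on a version where no extra column is created.

```r
library(data.table)

set.seed(1)  # hypothetical seed, only so the sample is reproducible
factorTest <- sample(c('O', 'A', 'B', 'AB'), 50, replace = TRUE)

# Option 1: leave stringsAsFactors out altogether.
# Character vectors stay character by default in data.table.
test3 <- data.table(
  dabo    = factor(factorTest, levels = c('O', 'A', 'B', 'AB')),
  dabostr = factorTest
)
sapply(test3, class)

# Option 2: if the extra column has already been created, drop it by reference.
test2 <- data.table(dabostr = factorTest, stringsAsFactors = FALSE)
if ("stringsAsFactors" %in% names(test2)) {
  test2[, stringsAsFactors := NULL]
}
names(test2)
```

Option 1 is probably the simpler choice when the table is built from scratch. Option 2 only helps when the table already exists with the stray column.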