In a user-defined function I would like to do some data.table transformations; in particular, I want to create a new column with the ':=' operator.
Assume I want to create a new column called Sex that capitalizes the first letter of the column df$sex in my example data.frame df.
The output of my prepare function should be a data.table with the same name as before, but with the additional "capitalised" column.
I have tried several ways of doing this. However, I always get the following warning (and no correct output):
Warning message: In `[.data.table`(x, , `:=`(Sex, stringr::str_to_title(sex))) : Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.
library(data.table)
library(magrittr)
library(stringr)
df <- data.frame("age" = c(17, 04),
                 sex = c("m", "f"))
df %>% setDT()
is.data.table(df)
This is the easiest way to write my function:
prepare1 <- function(x) {
  x[, Sex := stringr::str_to_title(sex)]
}
prepare1(df)
#--> WARNING (as quoted above)
prepare2 <- function(x) {
  x[, `:=`(Sex, stringr::str_to_title(sex))]
}
prepare2(df)
#--> WARNING (as quoted above)
prepare3 <- function(x) {
  require(data.table)
  y <- as.data.table(list(x))
  y <- y[, Sex := stringr::str_to_title(sex)]
  x <<- y
}
prepare3(df)
The last version does NOT throw the warning, but it creates a new dataset called x. I wanted to overwrite the dataset that I passed to the function (if I have to go that way at all).
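A less intrusive alternative to the global assignment with <<- might be to let the function return the new table and reassign it under the old name at the call site. This is only a sketch: prepare_return is a hypothetical name, and the approach copies the data instead of modifying it by reference.

```r
library(data.table)
library(stringr)

df <- data.frame(age = c(17, 4), sex = c("m", "f"))

# Hypothetical helper: builds a data.table copy, adds the column, returns it.
prepare_return <- function(x) {
  y <- as.data.table(x)                    # new data.table built from the data.frame
  y[, Sex := stringr::str_to_title(sex)]   # add the capitalised column to the copy
  y[]                                      # return the table visibly
}

# Reassign the result under the original name.
df <- prepare_return(df)
```

The price is one copy of the data per call, which may matter for large tables.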
From the := help file I also know that I can use set(); however, I have not been able to adapt the call appropriately. If that could solve my problem, I would be happy to receive help on that, too! My attempt, set(x, i = NULL, Sex, str_to_title(sex)), is apparently wrong ...
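For reference, set() expects the target column in j (as a name or position) and the new values in value. Unlike inside [.data.table, column names are not evaluated within the table, so the source column has to be extracted explicitly. A minimal sketch, assuming the table is a properly created data.table (here built with data.table()); prepare_set is a hypothetical name:

```r
library(data.table)
library(stringr)

# Example data created directly as a data.table
dt <- data.table(age = c(17, 4), sex = c("m", "f"))

# Hypothetical helper: adds the capitalised column by reference via set()
prepare_set <- function(x) {
  set(x, j = "Sex", value = stringr::str_to_title(x[["sex"]]))
  invisible(x)
}

prepare_set(dt)
```

This only shows the set() syntax; it does not by itself repair a table whose internal self-reference is already invalid.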
Upon request, and to make the discussion in the comments clearer, here is how my code produces the problem:
library(data.table)
library(stringr)
df <- data.frame("age" = c(17, 04),
                 sex = c("m", "f"))
GetLastAssigned <- function(match = "<- *data.frame",
                            remove = " *<-.*") {
  f <- tempfile()
  savehistory(f)
  history <- readLines(f)
  unlink(f)
  match <- grep(match, history, value = TRUE)
  get(sub(remove, "", match[length(match)]))
}
#ok, no need for magrittr
setDT(GetLastAssigned())
#check the last function worked
is.data.table(df)
prepare1 <- function(x) {
  x[, Sex := stringr::str_to_title(sex)]
}
prepare1(GetLastAssigned())
# I get a warning and it does not work.
prepare1(df)
# I get a warning and it does not work, either.
# If I manually type setDT(df), everything works fine, but I cannot type the "right" dfs at all the places where I need to do this transformation.
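One possible way to avoid typing each name by hand, assuming the name of the object is available as a string, is to look the object up by name and assign a converted copy back under the same name. The helper as_dt_by_name below is hypothetical, and it uses as.data.table() (which copies) rather than setDT():

```r
library(data.table)
library(stringr)

df <- data.frame(age = c(17, 4), sex = c("m", "f"))

# Hypothetical helper: replace the object called `name` with a data.table copy.
as_dt_by_name <- function(name, envir = parent.frame()) {
  assign(name, as.data.table(get(name, envir = envir)), envir = envir)
  invisible(name)
}

as_dt_by_name("df")

prepare1 <- function(x) {
  x[, Sex := stringr::str_to_title(sex)]
  invisible(x)
}
prepare1(df)
```

Because the converted table is bound to the name with a plain assignment, the later := inside prepare1() should be able to add the column by reference.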
… magrittr. If you just do setDT(df) this works as intended. – Roland
… `%>%` you see quite a few functions that are good candidates for this kind of issues. – Roland
… setDT to convert to data.tables before providing it to a function. But I'd like that these cases be resolved" – Frank
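The suggestion in the comments to call setDT(df) directly, without a pipe or wrapper function, can be sketched as follows. The data and prepare1() are taken from the question; only invisible(x) is added so that the function returns the table.

```r
library(data.table)
library(stringr)

df <- data.frame(age = c(17, 4), sex = c("m", "f"))

# Direct call: setDT() sees the name `df` and converts that object in place.
setDT(df)

prepare1 <- function(x) {
  x[, Sex := stringr::str_to_title(sex)]   # adds the column by reference
  invisible(x)
}

prepare1(df)
```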