
I'm trying to use := by group on an existing column of type 'integer' where the new values are of type 'double', and it fails.

My scenario is converting a column representing time into POSIXct based on values in other columns. I could change how the data.table is created as a workaround, but I'm still interested in how to actually change the type of a column, as the error message suggests.

Here's a simple toy example of my problem:

library(data.table)
db = data.table(id=rep(1:2, each=5), x=1:10, y=runif(10))
db
    id  x          y
 1:  1  1 0.47154470
 2:  1  2 0.03325867
 3:  1  3 0.56784494
 4:  1  4 0.47936031
 5:  1  5 0.96318208
 6:  2  6 0.83257416
 7:  2  7 0.10659533
 8:  2  8 0.23103810
 9:  2  9 0.02900567
10:  2 10 0.38346531

db[, x:=mean(y), by=id]   

Error in `[.data.table`(db, , `:=`(x, mean(y)), by = id) : 
Type of RHS ('double') must match LHS ('integer'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)
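The mismatch the message refers to can be inspected directly. This is a minimal sketch that rebuilds the toy table; the names `col_types` and `rhs_type` are only illustrative:

```r
library(data.table)

db <- data.table(id = rep(1:2, each = 5), x = 1:10, y = runif(10))

# Storage type of every column: 'x' was created from 1:10, so it is integer
col_types <- sapply(db, typeof)
print(col_types)

# Storage type of the value being assigned: a mean is always double
rhs_type <- typeof(mean(db$y))
print(rhs_type)
```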

1 Answer


Because the 'x' column is of class 'integer', we can convert it to 'numeric' before assigning 'mean(y)' to it. This is useful whenever 'x' is replaced with the mean of a numeric variable (including 'x' itself).

db[, x:= as.numeric(x)][, x:= mean(y), by=id][]
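To see the type change step by step, here is a small self-contained sketch of the same approach. The seed is arbitrary and only makes the run reproducible:

```r
library(data.table)

set.seed(1)  # arbitrary seed, only for reproducibility
db <- data.table(id = rep(1:2, each = 5), x = 1:10, y = runif(10))

class(db$x)                  # "integer"

# Step 1: change the column type in place (no grouping involved)
db[, x := as.numeric(x)]
class(db$x)                  # "numeric"

# Step 2: the grouped assignment now has matching types on both sides
db[, x := mean(y), by = id]
print(db)
```

Each row of 'x' ends up holding the mean of 'y' for its 'id' group.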

Or assign to a new column, then drop the old one and rename the new column afterwards:

setnames(db[, x1:= mean(y),by=id][,x:=NULL],'x1', 'x')
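The new-column pattern may also suit the POSIXct scenario from the question. The following is only a sketch under assumptions: the table `dt`, its columns `day` and `secs`, and the temporary name `time` are hypothetical and not taken from the original data.

```r
library(data.table)

# Hypothetical data: 'secs' holds integer second offsets within each 'day'
dt <- data.table(
  day  = rep(c("2015-01-01", "2015-01-02"), each = 3),
  secs = rep(c(0L, 60L, 120L), times = 2)
)

# Build the timestamps in a NEW column, so no existing integer column
# has to accept a double-based POSIXct value
dt[, time := as.POSIXct(day, tz = "UTC") + secs, by = day]

# Drop the old integer column and give the new one its name
dt[, secs := NULL]
setnames(dt, "time", "secs")

print(dt)
```

Creating the column fresh lets data.table pick the type and class from the assigned value, which sidesteps the type check on the existing column.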

Or we can delete 'x' by assigning 'NULL' to it and then recreate 'x' as the mean of 'y' (@David Arenburg's suggestion):

db[, x:=NULL][, x:= mean(y), by= id][]