
Apologies in advance if this is a duplicate, but I haven't had any luck finding a solution to what I'm trying to get working (and learn) here.

I'm trying to move my code from data.frame to data.table approaches because of the speed advantages, as I'm dealing with hundreds of measurement files, each with millions of values.

I'm having trouble figuring out how to code the following scenario. My column names consist of two parts, a channel and a parameter, for example FWS.Maximum and FWS.Minimum.

Since the code has to work for instrument data with differing channels, I wrote it so that R automatically finds the channel part and then loops through the channels with lapply. What I am trying to do here is calculate Range as the Channel.Maximum column minus the Channel.Minimum column.

mydf[, FWS.Range := (FWS.Maximum - FWS.Minimum)]

works fine, but in the loop it would look like this:

x <- "FWS"

mydf[ , paste(x, "Range", sep = '.') := paste(x, "Maximum", sep = '.') - paste(x, "Minimum", sep = '.')]

but that throws the following error:

Error in paste(x, "Maximum", sep = ".") - paste(x, "Minimum", sep = ".") :
non-numeric argument to binary operator

Dummy data with only 5 columns to test it on (the real data has dozens of columns that I need to adjust in this style):

mydf = data.table(ID = c(1,2,3,4), FWS.Maximum = c(12,17,29,22), FWS.Minimum = c(5,4,1,6),
FL.Red.Maximum = c(12,17,29,22), FL.Red.Minimum = c(5,4,1,6))

The code I'm trying to get this to work for is this:

lapply(substr(names(mydf)[grepl("Maximum", names(mydf))], 1, nchar(names(mydf)[grepl("Maximum", names(mydf))])-8), function(x) { 
  mydf[ paste(x, "Range", sep = '.'):= paste(x, "Maximum", sep = '.') - paste(x, "Minimum", sep = '.')]  })

which currently tells me

Error in `:=`(paste(x, "Range", sep = "."), paste(x, "Maximum", sep = ".") - : Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").

You're trying to subtract strings. You need to get() the strings (variable names) to turn them into the variables themselves. - MichaelChirico
In addition to @MichaelChirico's comment, you probably need mydf[ , paste(x, "Range", sep = '.') := get(paste(x, "Maximum", sep = '.')) - get(paste(x, "Minimum", sep = '.'))] - Jaap
Yes, I was indeed missing a comma as well. I figured that out myself and wrapped the call in invisible( ... ) to stop the printing. - Mark

1 Answer


Thanks to the comments from MichaelChirico and Jaap, plus my own attempts to stop the printing to the console, this works:

# list.of.channels is assumed to hold the channel prefixes, e.g. c("FWS", "FL.Red")
invisible(lapply(list.of.channels, function(x) {
  mydf[ , paste(x, "Range", sep = '.') := get(paste(x, "Maximum", sep = '.')) - get(paste(x, "Minimum", sep = '.'))]
}))
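For anyone who wants to run this end to end, here is a minimal sketch under a few assumptions: the dummy data uses 4 rows in every column, and list.of.channels (which is not defined anywhere above) is built by stripping a ".Maximum" suffix from the matching column names. The sub()/grep() step is just one possible way to derive the prefixes, not necessarily how it was done in the real code.

```r
library(data.table)

# Dummy data in the style of the question (every column has 4 rows)
mydf <- data.table(
  ID = c(1, 2, 3, 4),
  FWS.Maximum = c(12, 17, 29, 22), FWS.Minimum = c(5, 4, 1, 6),
  FL.Red.Maximum = c(12, 17, 29, 22), FL.Red.Minimum = c(5, 4, 1, 6)
)

# Assumption: a channel prefix is whatever precedes ".Maximum" in a column name
max.cols <- grep("\\.Maximum$", names(mydf), value = TRUE)
list.of.channels <- sub("\\.Maximum$", "", max.cols)

# get() turns each pasted column name into the column itself;
# := adds the new column by reference, so the lapply() result is discarded
invisible(lapply(list.of.channels, function(x) {
  mydf[, paste(x, "Range", sep = ".") :=
         get(paste(x, "Maximum", sep = ".")) - get(paste(x, "Minimum", sep = "."))]
}))

print(mydf)
```

With this data, both FWS.Range and FL.Red.Range come out as 7, 13, 28, 16. Because := updates mydf in place, nothing needs to be assigned back after the loop.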