
I am trying to write a function that takes a data.frame and a list (or character vector) of variable names from that data.frame, and creates new variables whose names are derived from the listed names and whose values are computed from the corresponding variables.

For example, if data.frame d has variables x, y, z, w and the list of names is c('x', 'z'), the output might be vectors named x.cat and z.cat, with values based on the values of d$x and d$z.

I can do this with a loop:

df <- data.frame(x = 1:10, y = 11:20, z = 21:30, w = 41:50)

vnames <- c("x", "w")

loopfunc <- function(dat, vlst) {
  s <- paste(vlst, "cat", sep = ".")   # new column names: x.cat, w.cat
  for (i in seq_along(vlst)) {
    r <- dat[[vlst[i]]] %% 4           # remainder mod 4 of the source column
    dat[[s[i]]] <- NA
    dat[[s[i]]][r == 0] <- 0
    dat[[s[i]]][r == 1 | r == 3] <- 1
    dat[[s[i]]][r == 2] <- 2
  }
  dat[s]                               # return only the new columns
}
dout <- loopfunc(df, vnames)

This outputs a 10x2 data.frame with columns x.cat and w.cat. Their values are 0, 1, or 2, depending on the remainder mod 4 of the corresponding values of df$x and df$w.

I would like to find a way to do something like this without a loop, maybe using the apply functions?

Here is a failed attempt:

noloopfunc <- function(dat, l){
  assign(l[2], NA)
  assign(l[2][d[l[1]] %% 4 == 0], 0)
  assign(l[2][d[l[1]] %% 4 == 2], 2)
  assign(l[2][(d[l[1]] %% 4 == 1) | (d[l[1]] %% 4 == 3)], 1)
  as.name(l[2])
}

newvnames <- sapply(vnames, function(x){paste(x, "cat", sep = ".")})
vpairs <- mapply(c, vnames, newvnames, SIMPLIFY = F)

lapply(vpairs, noloopfunc, d <- df)

Here the formal argument l is supposed to represent vpairs[[1]] or vpairs[[2]], each a character vector of length 2 holding the source name and the new name.
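A minimal sketch of two reasons an assign-based attempt like this tends not to work as written. The names make_var and x.cat.demo are hypothetical, used only for illustration. First, assign() called inside a function binds the variable in that function's own frame by default, so it disappears when the call returns. Second, indexing a name such as l[2] subsets the character vector holding the name, not the variable the name refers to.

```r
# Hypothetical helper, for illustration only
make_var <- function(name) {
  assign(name, NA)                 # binds `name` in make_var's own frame
  exists(name, inherits = FALSE)   # TRUE here: the variable exists locally
}

created_locally <- make_var("x.cat.demo")

# The local binding is gone once make_var() has returned
# (assuming no variable called x.cat.demo exists elsewhere)
visible_outside <- exists("x.cat.demo")

# Subsetting the name just subsets a length-1 character vector
name <- "x.cat.demo"
subset_of_name <- name[c(FALSE, TRUE)]   # position 2 does not exist, so NA
```

So even with envir set explicitly, the conditional assignments would still need a different way to reach into the named variable.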

I found several threads on Stack Overflow about converting strings to variable names, but I couldn't find anything where it is used this way, where the variables have to be referred to afterwards and assigned values non-interactively.

Thanks for any help.

You can do this with assign, but why would you ever want to? This isn't an end in itself, and there's almost certainly a better way to accomplish whatever you're really trying to do. Just put them in a list or even in an environment. - Gregor Thomas
Well, I actually can't do it with assign here. I think there is a better way too, or I wouldn't be asking here, would I? Please show me if you know a better way. I already know it has something to do with a list or even an environment. These kinds of general offhand remarks are not very helpful. - bee
list2env(df[,c('x','z')]) and attach? I'm not a fan (and agree with @Gregor), but it's an option. - r2evans
And @bee, very often questions are asked poorly, so these kinds of suggestions lead new programmers towards a different line of thinking about the problem. Please differentiate between a "well-intentioned comment that I cannot use (for some reason)" and an off-hand one, which his comment was not. (I'm not saying you're a new programmer ... just speaking generically about comments like that.) - r2evans
I think the better way is to go straight to your goal, whatever you want to do with these variables. Share with us what your next step(s) are and maybe we can help you get there without this step. If you start this way I think your next question will be about trouble getting eval(parse(...)) to work for you, which is hard to write, harder to debug, and hardest to maintain. There is almost certainly a better way but you need to tell us where you're going. - Gregor Thomas
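The suggestion in these comments, keeping the derived variables in a named list (or an environment) rather than creating free-standing variables, might look roughly like the sketch below. The helper cat4 and the names cats and env are assumptions introduced for illustration; the categorisation rule is the one from the question.

```r
df <- data.frame(x = 1:10, y = 11:20, z = 21:30, w = 41:50)
vnames <- c("x", "w")

# Hypothetical helper: 0 for remainder 0, 2 for remainder 2, 1 for remainders 1 and 3
cat4 <- function(v) {
  r <- v %% 4
  ifelse(r == 0, 0, ifelse(r == 2, 2, 1))
}

# One list element per requested column, named x.cat and w.cat
cats <- lapply(df[vnames], cat4)
names(cats) <- paste(vnames, "cat", sep = ".")

# Results can be looked up by a string name, with no assign() needed
x_cat <- cats[["x.cat"]]

# If actual variables are wanted, a dedicated environment keeps them contained
env <- list2env(cats, envir = new.env())
w_cat <- get("w.cat", envir = env)
```

Either container lets later code refer to the results by name programmatically, which is the part that is awkward with assign().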

1 Answer


You can replace your loop with an apply variant such as sapply, which builds one output column per name in vnames:

dout <- as.data.frame(sapply(vnames, function(x) {
    r <- df[[x]] %% 4              # remainder mod 4 of column x
    out <- rep(NA, nrow(df))
    out[r == 0] <- 0
    out[r == 1 | r == 3] <- 1
    out[r == 2] <- 2
    out
}))
# sapply names the columns after vnames; rename them to x.cat, w.cat
names(dout) <- paste(vnames, "cat", sep = ".")
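For reuse, the same idea can be wrapped in a function with the signature of the loop version in the question. This is only a sketch: the name catfunc and the lookup vector are assumptions, not part of the original answer, and the lookup trick assumes the columns hold whole numbers (a fractional value would be truncated when used as an index instead of yielding NA).

```r
df <- data.frame(x = 1:10, y = 11:20, z = 21:30, w = 41:50)
vnames <- c("x", "w")

# Hypothetical name; takes the data.frame and the column names as arguments
catfunc <- function(dat, vlst) {
  # lookup[r + 1] is the category for remainder r = 0, 1, 2, 3
  lookup <- c(0, 1, 2, 1)
  # lapply over the selected columns; each result is a plain numeric vector
  out <- lapply(dat[vlst], function(v) lookup[v %% 4 + 1])
  names(out) <- paste(vlst, "cat", sep = ".")
  as.data.frame(out)
}

dout <- catfunc(df, vnames)
```

Using lapply followed by as.data.frame avoids relying on sapply simplifying the result to a matrix, and passing dat explicitly means the function does not depend on a global df.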