I have a very large dataframe...

v.l.df <- data.frame(seq(0, 10, 0.0001),seq(0, 10, 0.0001),seq(0, 10, 0.0001))

...and a function with some if statements and calculations...

a.f <- function(cell_value,action){
  if(action == 1){
    cell_value * 1
  }

  else if(action == 2){
    cell_value * 5
  }
}

I now want to apply this function row by row to the first two columns of v.l.df and sum the returned values. The new columns should thus contain (pseudocode):

new_col_1                                    new_col_2
a.f(v.l.df[1,1],1) + a.f(v.l.df[1,2],1)      a.f(v.l.df[1,1],2) + a.f(v.l.df[1,2],2)
a.f(v.l.df[2,1],1) + a.f(v.l.df[2,2],1)      a.f(v.l.df[2,1],2) + a.f(v.l.df[2,2],2)
...

How can this be achieved? I am struggling with passing multiple arguments when using apply and with summing the values returned from the function.

EDIT: Changed the example function. It should now return the following:

> a.f(2,1)
[1] 2
> a.f(2,2)
[1] 10
2
What does your function do? As coded, it doesn't return anything. Try running it with cell.test = 0.7 and action = 1: should a.f(cell.test) return 100 or 70? - Bernardo
Sorry, I oversimplified the function incorrectly. I have changed the function a.f in the example and it should work now. - user3347232
Is the first element of new_col_2 meant to be a.f(v.l.df[1,1],2) + a.f(v.l.df[1,2],2)? - vpipkt
Yes! I corrected the example in the question. - user3347232

2 Answers

0
votes

I'd do this in a couple of steps. You can reduce to fewer steps, but I prefer to keep it more readable:

First, apply a.f to every cell of the first two columns of v.l.df twice, once with action=1 and once with action=2 (to pass additional arguments inside apply, just put them after FUN):

action.1 = apply(v.l.df[,1:2], c(1,2), FUN = a.f, action=1)

action.2 = apply(v.l.df[,1:2], c(1,2), FUN = a.f, action=2)

Then apply rowSums to both action.1 and action.2 and store the results in the same data.frame:

v.l.df$new.1 = rowSums(action.1)         #or v.l.df$new.1 = apply(action.1,1,sum)
v.l.df$new.2 = rowSums(action.2)         #or v.l.df$new.2 = apply(action.2,1,sum)
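
For reference, here is a minimal end-to-end sketch of these two steps on a tiny data frame, so the result is easy to inspect by eye. The five-row small.df and its column names a, b, c are assumptions made for illustration and are not part of the question's data.

```r
# a.f as defined in the question
a.f <- function(cell_value, action) {
  if (action == 1) {
    cell_value * 1
  } else if (action == 2) {
    cell_value * 5
  }
}

# Hypothetical small stand-in for v.l.df
small.df <- data.frame(a = 1:5, b = 1:5, c = 1:5)

# Call a.f once per cell of the first two columns;
# the extra argument `action` is forwarded by apply()
action.1 <- apply(small.df[, 1:2], c(1, 2), FUN = a.f, action = 1)
action.2 <- apply(small.df[, 1:2], c(1, 2), FUN = a.f, action = 2)

# Sum each row of the two result matrices
small.df$new.1 <- rowSums(action.1)
small.df$new.2 <- rowSums(action.2)

print(small.df)
```

Note that apply with c(1,2) calls a.f once per cell, so on a data frame with 100,000 rows this is likely to be noticeably slower than a column-wise approach.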
0
votes

I believe your result is achieved by:

v.l.df$new_col_1 <- a.f(v.l.df$V1, 1) + a.f(v.l.df$V2, 1)
v.l.df$new_col_2 <- a.f(v.l.df$V1, 2) + a.f(v.l.df$V2, 2)

Assuming your first two columns are named V1 and V2 respectively.
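
This works because a.f only multiplies cell_value and tests a single action value, so it can take a whole column at once. A small sketch, assuming the columns are created with the names V1, V2 and V3 (the data.frame() call in the question generates different names automatically), and using a hypothetical five-row demo.df:

```r
# a.f as defined in the question
a.f <- function(cell_value, action) {
  if (action == 1) {
    cell_value * 1
  } else if (action == 2) {
    cell_value * 5
  }
}

# Hypothetical small data frame with explicitly named columns
demo.df <- data.frame(V1 = seq(0, 1, 0.25),
                      V2 = seq(0, 1, 0.25),
                      V3 = seq(0, 1, 0.25))

# Each call processes an entire column; `action` stays a single number
demo.df$new_col_1 <- a.f(demo.df$V1, 1) + a.f(demo.df$V2, 1)
demo.df$new_col_2 <- a.f(demo.df$V1, 2) + a.f(demo.df$V2, 2)

print(demo.df)
```

If the columns keep their automatically generated names, indexing by position (for example demo.df[[1]] and demo.df[[2]]) should work the same way.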

You may also define another function

a.f.2 <- function(val1, val2, method) {
    a.f(val1, method) + a.f(val2, method)
}

And apply it as follows

v.l.df$new_col_1 <- a.f.2(v.l.df$V1, v.l.df$V2, 1)
v.l.df$new_col_2 <- a.f.2(v.l.df$V1, v.l.df$V2, 2)

You can also write this summary function with a ... argument so that it handles an arbitrary number of columns. The example below forwards ... to sapply, so it expects (and does not check for) a single data frame whose columns are the inputs:

a.f.n<- function(method,...){
    rowSums(sapply(...,a.f,method))
}

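Then apply this as follows (the range 1:1000 is illustrative and assumes a data frame with at least 1000 columns; for the question's data frame use 1:2):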

v.l.df$new_col_1 <- a.f.n(v.l.df[,1:1000], method=1)
v.l.df$new_col_2 <- a.f.n(v.l.df[,1:1000], method=2)

I am not sure how efficient this will be, but it is compact. :-)
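
As a quick sanity check, the sketch below runs a.f.n on a tiny data frame and compares it with the explicit column-by-column sum. The four-row demo.df and its column names are assumptions made for illustration only.

```r
# a.f as defined in the question
a.f <- function(cell_value, action) {
  if (action == 1) {
    cell_value * 1
  } else if (action == 2) {
    cell_value * 5
  }
}

# a.f.n as defined above: sapply() runs a.f on each column,
# rowSums() then adds the per-column results row by row
a.f.n <- function(method, ...) {
  rowSums(sapply(..., a.f, method))
}

# Hypothetical small data frame
demo.df <- data.frame(V1 = 1:4, V2 = 5:8, V3 = 9:12)

res.1 <- a.f.n(demo.df[, 1:2], method = 1)  # two columns, action 1
res.2 <- a.f.n(demo.df[, 1:3], method = 2)  # three columns, action 2

# Compare with the explicit sum from the first approach
explicit.1 <- a.f(demo.df$V1, 1) + a.f(demo.df$V2, 1)
print(all.equal(as.numeric(res.1), explicit.1))
```

One caveat: selecting a single column with demo.df[, 1] drops to a plain vector, in which case sapply no longer returns a matrix and rowSums fails; passing at least two columns (or using drop = FALSE) avoids this.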