
I am trying to run a summation on each row of a dataframe. Let's say I want to take the sum of 100n^2, from n=1 to n=4.

> df <- data.frame(n = 1:4, a = 100)
> df
  n   a
1 1 100
2 2 100
3 3 100
4 4 100

Simpler example:

Let's make fun our example summation function. I can pull the 100 out because I can just multiply it back in later.

fun <- function(x) {
    i <- seq(1, x, 1)
    sum(i^2)
}

I then want to apply this function to each row of the dataframe, where df$n provides the upper bound of the summation.

The desired outcome would be as follows, in df$b:

> df
  n   a  b
1 1 100  1
2 2 100  5
3 3 100 14
4 4 100 30

To achieve these results I've tried the apply function:

apply(df$n,1,fun)

and also with df converted into a matrix:

mat <- as.matrix(df)
apply(mat[1,],1,fun)

Both return an error:

Error in seq.default(1, x, 1) : 'to' must be of length 1 

I understand this error, in that I understand why seq requires a 'to' value of length 1, but I don't know how to go forward from here.

I have also tried the same while reading the dataframe as a matrix.

Maybe less simple example:

In my case I only need to multiply the results above, df$b, by 100 (or df$a) to get my final answer for each row. In other cases, though, the second value might be embedded in the summand itself, for example a^i. How would I pass both variables, a and n, to the function?

Underlying question:

My underlying goal is to apply a summation to each row of a dataframe (or a matrix). The above questions stem from my attempt to do so using seq(), as I saw advised in an answer on this site. I will gladly accept an answer that obviates the above questions with a different way to run a summation.

Often you can vectorise, in this case cumsum(df$n^2) - user20650
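
As a rough illustration of the vectorised route: cumsum gives the desired column here only because df$n happens to be exactly 1, 2, 3, 4 in order. The closed form n(n+1)(2n+1)/6 for the sum of squares does not depend on row order. The column name b2 is made up for this sketch.

```r
df <- data.frame(n = 1:4, a = 100)

# Running total of squares; matches the desired output only because
# df$n is the consecutive sequence 1, 2, 3, 4
df$b <- cumsum(df$n^2)

# Closed form for 1^2 + 2^2 + ... + n^2, valid for any positive integer n
# regardless of row order (b2 is a hypothetical column name)
df$b2 <- df$n * (df$n + 1) * (2 * df$n + 1) / 6

df$b   # 1, 5, 14, 30
df$b2  # 1, 5, 14, 30
```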

1 Answer


seq does not accept a vector for its from and to arguments, so fun has to be called once per element of df$n. We can loop over df$n with sapply:

df$b <- sapply(df$n, fun)
df$b
#[1]  1  5 14 30

Or we can wrap the function with Vectorize, so that it accepts the whole vector at once:

Vectorize(fun)(df$n)
#[1]  1  5 14 30
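
For the case raised in the question where the second value sits inside the summand (for example a^i), one possible approach is mapply, which walks over several vectors in parallel and passes one element of each to the function. This is only a sketch: fun2 and the column name c are hypothetical names, and the summand a^i is taken from the question's example.

```r
# Hypothetical two-argument version of fun: sum of a^i for i = 1..n
fun2 <- function(n, a) {
    i <- seq_len(n)   # 1, 2, ..., n
    sum(a^i)
}

df <- data.frame(n = 1:4, a = 100)

# mapply calls fun2(df$n[k], df$a[k]) for each row k
df$c <- mapply(fun2, df$n, df$a)
df$c  # 100, 10100, 1010100, 101010100
```

sapply only iterates over one vector, which is why mapply is used here; Vectorize(fun2)(df$n, df$a) should give the same result, since Vectorize is built on mapply.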