2
votes

First time poster - so apologies if this question is basic/poorly explained. Grateful to anyone can help or point me in the right direction!

I would like to do the following within an R dataframe if possible:

Existing Data

Column A is a vector of values, say 10 to 20.

New Data/columns

Column B will be be Column A multiplied by Column C

Column C will be Column C minus Column B from the previous row of data, i.e data$C[-1] - data$B[-1], apart the first row of Column C of course, which I will give a fixed value.

I have tried these as separate steps, but I keep overwriting columns B and C, and have a feeling I have been going about this the wrong way! I could share my code, but I think this would confuse matters!

Thanks in advance!

EDIT TO ADD CODE:

A <- c(0.1,0.2,0.3,0.4,0.5)
df1 <- data.frame(A)

df1$B <- 0 
df1$C <- 0 

df1$C[1] <- 100

df2 <- df1 %>%
  mutate(B = C * A,
         C = lag(C-B))

RESULT FROM THE ABOVE

A B C
1 0.1 10 NA
2 0.2 0 90
3 0.3 0 0
4 0.4 0 0
5 0.5 0 0

EXPECTED OUTPUT

A B C
1 0.1 10 100
2 0.2 18 90
3 0.3 21.6 72
4 0.4 20.16 50.4
5 0.5 15.12 30.24

C2 = C1 - B1 B2 = C2 * A2

2

2 Answers

2
votes

We can use accumulate from purrr to do recursive update

library(dplyr)
library(purrr)
tmp <-  with(df1, accumulate(A,  ~ .x - (.x * .y), .init = first(C)))
df2 <- df1 %>% 
    mutate(C = head(tmp, -1), B = -diff(tmp))
df2
#    A     B      C
#1 0.1 10.00 100.00
#2 0.2 18.00  90.00
#3 0.3 21.60  72.00
#4 0.4 20.16  50.40
#5 0.5 15.12  30.24

Or use base R

tmp <- with(df1, Reduce(function(x, y) x - (x * y), A, 
     accumulate = TRUE, init = C[1]))
df2 <- transform(df1, C = head(tmp, -1), B = -diff(tmp))
2
votes

If you don't mind using a mathematical approach, you can first derive the general expression for the recursion and then have the R code afterwards.

Below is one implementation with base R

transform(
  transform(
    df1,
    C = C[1] * c(1, cumprod(1 - A)[-nrow(df1)])
  ),
  B = A * C
)

which gives

    A     B      C
1 0.1 10.00 100.00
2 0.2 18.00  90.00
3 0.3 21.60  72.00
4 0.4 20.16  50.40
5 0.5 15.12  30.24

A data.table option in a similar manner is

> setDT(df1)[, C := first(C) * c(1, cumprod(1 - A)[-.N])][, B := A * C][]
     A     B      C
1: 0.1 10.00 100.00
2: 0.2 18.00  90.00
3: 0.3 21.60  72.00
4: 0.4 20.16  50.40
5: 0.5 15.12  30.24