1
votes

I need to create a column in which each observation equals the value of the previous observation multiplied by an observation in another column plus 1. I'm trying to create the indx column in the example below. indx[1] is hard coded as 1.000, but indx[2] = indx[1]*(1+chng[2]).

I've been using mutate in the dplyr library to make new columns, but I don't see how I can reference the previous value of a column as I'm creating it.

Edit: Updated the below example to include to reflect that the data values for i and chng reset to 0 and 0.000 respectively after every 5 observations and that indx would need to reset to 1.000 when this happens as well and begin it's accumulation again.

Example data.table:

test <- data.frame(i = c(0,1,2,3,4,0,1,2,3,4)
               ,chng = c(.000,.031,.005,-.005,.017,.000,.012,.003,-.013,-.005,)
               ,indx = c(1,1.031,1.037,1.031,1.048,1,1.012,1.015,1.002,.997))

     i   chng  indx
 1:  0  0.000 1.000
 2:  1  0.031 1.031
 3:  2  0.005 1.037
 4:  3 -0.005 1.031
 5:  4  0.017 1.048
 6:  0  0.000 1.000
 7:  1  0.012 1.012
 8:  2  0.003 1.015
 9:  3 -0.013 1.002
10:  4 -0.005 0.997
2

2 Answers

2
votes

Mathematically this is the same as cumprod(test$chng + 1):

test %>% mutate(indx = cumprod(chng + 1))

giving:

  i   chng     indx
1 0  0.000 1.000000
2 1  0.031 1.031000
3 2  0.005 1.036155
4 3 -0.005 1.030974
5 4  0.017 1.048501

Regarding the updated question, create a grouping variable g and apply the above by group:

test %>%
  group_by(g = cumsum(i == 0)) %>%
  mutate(indx = cumprod(chng + 1)) %>%
  ungroup %>%
  select(-g)
0
votes

As you mentioned, you need the previous value from a column/variable you're creating. This is a sequential process and one option is to use Reduce (instead of a for loop):

test <- data.frame(i = c(0:4)
                   ,chng = c(.000,.031,.005,-.005,.017))

test$indx = Reduce(function(x,y) x*(1+y), test$chng, accumulate = T, init = 1)[-1]

test

#   i   chng     indx
# 1 0  0.000 1.000000
# 2 1  0.031 1.031000
# 3 2  0.005 1.036155
# 4 3 -0.005 1.030974
# 5 4  0.017 1.048501

For the case where i resets you can use this:

test <- data.frame(i = c(0,1,2,3,4,0,1,2,3,4)
                   ,chng = c(.000,.031,.005,-.005,.017,.000,.012,.003,-.013,-.005))

library(tidyverse)

test %>%
  group_by(group = cumsum(i == 0)) %>%   # create a group based on i column
  mutate(indx = Reduce(function(x,y) x*(1+y), chng, accumulate = T, init = 1)[-1]) %>%  # apply the Reduce function to each group
  ungroup() %>%                          # forget the grouping
  select(-group) %>%                     # remove group column
  data.frame()                           # only for visualisation purposes (see the decimals)

#    i   chng      indx
# 1  0  0.000 1.0000000
# 2  1  0.031 1.0310000
# 3  2  0.005 1.0361550
# 4  3 -0.005 1.0309742
# 5  4  0.017 1.0485008
# 6  0  0.000 1.0000000
# 7  1  0.012 1.0120000
# 8  2  0.003 1.0150360
# 9  3 -0.013 1.0018405
# 10 4 -0.005 0.9968313