I have a data.frame that has time series values for a, b, and c. I would like to build a random time series that, for each row (i.e. date), takes the value from one randomly chosen column.

So for example, if we have the following df:

df <- data.frame(date = c(as.Date("2018-08-01"),as.Date("2018-09-01"), as.Date("2018-10-01")), a = c(1.0, 1.5, 1.8), b=c(-1.0, -2.0, 3.0), c=c(-2.0, -15.0, 1.7))

 #> df
 #           date   a  b     c
 #   1 2018-08-01 1.0 -1  -2.0
 #   2 2018-09-01 1.5 -2 -15.0
 #   3 2018-10-01 1.8  3   1.7

A possible random sample would look like this (here a was picked for the first month, b for the second, and c for the third):

df.random.sample <- data.frame(date = c(as.Date("2018-08-01"),as.Date("2018-09-01"), as.Date("2018-10-01")), random = c(1.0, -2.0, 1.7))

#> df.random.sample
#        date random
#1 2018-08-01    1.0
#2 2018-09-01   -2.0
#3 2018-10-01    1.7

Most importantly, I have many columns, so I would like this to work with column indexes so that I do not need to specify each column name.

1 Answer

If we want to sample by row, use apply:

cbind(df[1], random = apply(df[-1], 1, sample, size = 1))

Or use a vectorized approach with row/column matrix indexing, drawing one column index per row with replacement:

cbind(df[1], random = df[-1][cbind(seq_len(nrow(df)), sample(ncol(df) - 1, nrow(df), replace = TRUE))])
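To make the matrix-indexing idea easier to reuse, here is a minimal sketch wrapped in a helper. The name `sample_by_row` is hypothetical (it is not part of base R or of the answer above), and the sketch assumes that the first column holds the date and that every remaining column is numeric. It draws one column index per row with replacement, so each row's pick is independent and the number of rows does not need to match the number of value columns.

```r
# Hypothetical helper: pick one value per row from the non-date columns.
# Assumes column 1 is the date and all other columns are numeric.
sample_by_row <- function(df) {
  values <- as.matrix(df[-1])  # candidate values as a numeric matrix
  # One random column index per row, drawn independently (with replacement).
  col_idx <- sample(ncol(values), nrow(values), replace = TRUE)
  # A two-column matrix of (row, column) pairs extracts one cell per row.
  picked <- values[cbind(seq_len(nrow(values)), col_idx)]
  cbind(df[1], random = picked)
}

df <- data.frame(
  date = as.Date(c("2018-08-01", "2018-09-01", "2018-10-01")),
  a = c(1.0, 1.5, 1.8),
  b = c(-1.0, -2.0, 3.0),
  c = c(-2.0, -15.0, 1.7)
)

set.seed(42)  # only so the illustration is reproducible
df.random.sample <- sample_by_row(df)
```

Because the columns are addressed by position, the same call should work unchanged when the data.frame has many more value columns.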