0
votes

I have a table which contains a list of 'events'. Each event has a mean rate per annum together with two shape parameters for generating a random value from a Beta distribution.

The total average number of events per annum is the sum of the Rate column in the table and it is assumed that this is Poisson distributed.

I want to generate a Monte Carlo simulation such that for each simulation I generate a random number of events from the poisson distribution, then for each randomly generated event I want to draw a random row from the original FreqTable in proportion to the rates therein and merge on the shape parameters from that row. I then want to generate a random loss from the Beta distribution using those shape parameters.

I've provided some reproducible code below showing my initial attempt at this using a for loop. I've got stuck at the point where I need to merge on shape parameters from FreqTable by finding the closest match to the 'rand' value in FreqTable$CumulativeRate. I've tried using detect_index from the purr library but I can't get that to work within a loop.

I'm intending to generate at least 100k simulations when the code is working and I'm aware that a for loop might not be the most efficient way of doing this but (with my limited coding ability) it's all I could think of!

FreqTable <- data.frame(Event=c(1:10),
                        Rate=seq(0.1, 1, length.out = 10),
                        shape1=c(3:12),
                        shape2=c(5:14))


annual_freq <- sum(FreqTable$Rate)
FreqTable$CumulativeRate <- cumsum(FreqTable$Rate)/annual_freq

sims = 10

LossesList <- list()

for (i in 1:sims){
  num_losses <- rpois(1, annual_freq)
  sim_number <- rep(i, num_losses)
  rand <- runif(num_losses, 0, 1)  
  losses_list <- cbind(sim_number, rand)

  # PSEUDO CODE
  #look up shape parameters from FreqTable by using the nearest match on FreqTable$CumulativeRate from the rand value generated above
  #then generate a random variable from the Beta distribution for each row in the losses_list using the looked up shape parameters for that row

  LossesList[[i]] <- losses_list
}

SimulatedResults <- unlist(losses_list)
1
In the past, I've done some Monte Carlo sim work in R using apply functions. are you familiar with them?krfurlong
@KyleFurlong apply is not faster than a well written for loop. Since we have a JIT byte compiler now, it might even be slower.Roland
rpois and friends are vectorized. I would sample sims Poisson numbers at once.Roland

1 Answers

1
votes

Formalizing Roland's suggestion, you don't need to use for loops - relying on vectorized functions will improve performance though.

Let me know if this works for you:

FreqTable <- data.frame(Event=c(1:10),
                        Rate=seq(0.1, 1, length.out = 10),
                        shape1=c(3:12),
                        shape2=c(5:14))


annual_freq <- sum(FreqTable$Rate)
FreqTable$CumulativeRate <- cumsum(FreqTable$Rate)/annual_freq

sims = 10


sim.num <- 1:sims
num_losses <- rpois(sims, annual_freq)
rand.vals <- sapply(num_losses, runif) #list of num_losses[i] random values
sapply(rand.vals, cat)
table.ids2 <- lapply(rand.vals, function(x) {sapply(x, function(el){which.min(abs(FreqTable$CumulativeRate-el))}) })

sim.param.df <- data.frame()
for (i in 1:length(table.ids2)){

  closest.freq.table.event <- table.ids2[[i]]
  beta.shape1 <- sapply(table.ids2[[i]], FUN = function(x){FreqTable[FreqTable$Event == x, "shape1"]})
  beta.shape2 <- sapply(table.ids2[[i]], FUN = function(x){FreqTable[FreqTable$Event == x, "shape2"]})
  sim.id.vec <- rep(i, length(beta.shape1))

  temp.df <- data.frame(
    sim.id = sim.id.vec,
    closest.event = closest.freq.table.event,
    shape1 = beta.shape1,
    shape2 = beta.shape2
  )
  sim.param.df <- rbind.data.frame(sim.param.df, temp.df, make.row.names = FALSE)

}