3
votes

Background:

I'm running a Monte Carlo simulation to show that a particular process (a cumulative mean) does not converge over time, and often diverges wildly in simulation (the expectation of the random variable = infinity). I want to plot about 10 of these simulations on a line chart, where the x axis has the iteration number, and the y axis has the cumulative mean up to that point.

Here's my problem:

I'll run the first simulation (each sim. having 10,000 iterations) and build the main plot based on its range. But often a later simulation will have a range a few orders of magnitude larger than the first one, so the plot flies outside of the original range. So, is there any way to dynamically update the ylim or xlim of a plot upon adding a new set of points or lines?

I can think of two workarounds:

1. Store each simulation, pick the one with the largest range, and build the base graph off of that. Not elegant, and I'd have to store a lot of data in memory, but it would probably be laptop-friendly. [EDIT: as Marek points out, this is not a memory-intense example, but if you know of a nice solution that would support far more iterations such that memory becomes an issue (think high-dimensional walks that require much, much larger MC samples for convergence), then jump right in.]
2. Find a seed that appears to build a nice-looking version of the plot and set the ylim manually, which would make the demonstration reproducible.
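For concreteness, workaround 1 can be sketched like this. This is a minimal sketch, assuming a Cauchy sample as the divergent process (its mean does not exist) and an arbitrary seed; all simulations are run up front so the global range is known before plotting:

```r
# Workaround 1: run every simulation first, then plot with the global range.
set.seed(42)                          # arbitrary seed, for reproducibility
n.sims <- 10
n.iter <- 10000
# each column is one simulation's cumulative mean; rcauchy() has no finite mean
sims <- replicate(n.sims, cumsum(rcauchy(n.iter)) / seq_len(n.iter))
matplot(sims, type = "l", lty = 1, col = seq_len(n.sims),
        ylim = range(sims),           # global range known up front
        xlab = "iteration", ylab = "cumulative mean")
```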

Naturally I'm holding out for something more elegant than my workarounds. Hoping this isn't too pedestrian a problem, since I imagine it's not uncommon with simulations in R. Any ideas?

2
I just wonder: do you have any memory issues? 10 vectors of 10,000 isn't a lot. As I check: X <- lapply(1:10, function(i) rnorm(100000, 0, 1000)); object.size(X)/1024/1024 is just 7 MB of RAM. So 1. should be OK. – Marek
No, good point – I'm definitely NOT running into memory issues (hence laptop-friendly) with this simulation, but I'll be demonstrating far more complicated [Q]MC[MC] simulations in the future with the same graphical output. I'm looking for something that in general wouldn't rely on too much storage, especially as things get more complicated and I need far larger MC sample sizes to ensure convergence. This may be unavoidable / I might be overestimating the difficulty of implementing said future simulations. – HamiltonUlmer

2 Answers

5
votes

I'm not sure if this is possible using base graphics; if someone has a solution I'd love to see it. However, graphics systems based on grid (lattice and ggplot2) allow the graphics object to be saved and updated, so the axis limits are recomputed each time the object is printed. It's insanely easy in ggplot2.

require(ggplot2)

make some data:

foo <- data.frame(data = rnorm(100), numb = seq_len(100))

make an initial ggplot object and plot it:

p <- ggplot(foo, aes(numb, data)) + geom_line()
p

make some more data and add it to the plot

foo <- data.frame(data = rnorm(200), numb = seq_len(200))

p <- p + geom_line(aes(numb, data), data = foo, colour = "red")

plot the new object

p
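Putting the pieces above together, the same save-and-update idea works in a loop, and ggplot2 recomputes the limits from all layers at print time. A sketch, using a Cauchy cumulative mean as an assumed stand-in for the asker's divergent process:

```r
library(ggplot2)

p <- ggplot()
for (i in 1:10) {
  # one simulated run of 10,000 cumulative means
  sim <- data.frame(iter = seq_len(10000),
                    cm   = cumsum(rcauchy(10000)) / seq_len(10000),
                    run  = factor(i))
  p <- p + geom_line(aes(iter, cm, colour = run), data = sim)
}
print(p)  # axis limits are computed from all ten layers at print time
```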
0
votes

I think (1) is the best option. I actually don't think it's inelegant; it would be more computationally intensive to redraw the whole plot every time you hit a point outside the current xlim or ylim.
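For comparison, here is a rough sketch of what that redraw approach would involve (the starting limits and the Cauchy process are illustrative assumptions): whenever a new series escapes the current limits, the base plot is re-issued with expanded limits and every stored series is re-drawn.

```r
kept <- list()                         # series drawn so far
ylim <- c(-1, 1)                       # illustrative starting limits
plot(NULL, xlim = c(1, 1000), ylim = ylim,
     xlab = "iteration", ylab = "cumulative mean")
for (i in 1:10) {
  s <- cumsum(rcauchy(1000)) / seq_len(1000)
  kept[[i]] <- s
  if (min(s) < ylim[1] || max(s) > ylim[2]) {
    ylim <- range(ylim, s)             # expand limits and redraw everything
    plot(NULL, xlim = c(1, 1000), ylim = ylim,
         xlab = "iteration", ylab = "cumulative mean")
    invisible(lapply(kept, lines))
  } else {
    lines(s)                           # still in range: just add the line
  }
}
```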

Also, I saw in Peter Hoff's book on Bayesian statistics a cool use of ts() instead of lines() for cumulative sums/means. It looks pretty spiffy:

[figure: cumulative-mean traces plotted with ts()]
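A minimal sketch of that ts() idea, again assuming Cauchy cumulative means and an arbitrary seed; plot() on a multivariate ts with plot.type = "single" overlays all columns on one set of axes:

```r
set.seed(1)                            # arbitrary seed
sims <- replicate(5, cumsum(rcauchy(1000)) / seq_len(1000))
plot(ts(sims), plot.type = "single", col = 1:5,
     xlab = "iteration", ylab = "cumulative mean")
```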