I am running a simple genetic drift simulation in R.
# Population size
N<-5000
# Number with focal allele
X1<-(N/2)
# Number of generations
ngens<-(2000)
# Number of replicates
nreps<-10
# Drift function
drift <- function(N, X1, ngens, nreps) {
# Makes a matrix of NA's of nreps columns, and ngen rows
p <- matrix(NA, nrow=ngens, ncol=nreps)
# Set base population
p[1,] <- X1/N
# Repetitive sampling function, each generation sample 10 times from the generation before (gen-1)
for(gen in 2:ngens)
p[gen,] <- rbinom(n=nreps, size=N, prob=p[gen-1,]) / N
p
}
# Run function "drift" & output as data frame
p <- data.frame(drift(N, X1, ngens, nreps))
# Plot
matplot(p, type="l", ylim=c(0,1), lty=1, xlab="Generations", ylab="Proportion Focal",col="grey")
# Mean value
p$mean<-apply(p[,c(1:10)],1,mean)
matplot(p$mean, type="l", ylim=c(0,1), lty=1, xlab="Generations", ylab="Proportion Focal",col="black",add=T)
I would like to:
- Calculate confidence intervals (by bootstrapping) around the mean for each generation
- Add a pair of columns to my dataframe with the upper and lower confidence interval estimates which can then be plotted on the matplot just like I have done with the mean value
Can anyone suggest a way to do this? I am aware I need the Boot package and know roughly how to use this but guidance would be good.
The problem (for me) is getting a loop which generates a CI for each generation of the simulation and pasting that to the "p" dataframe
EDIT:
I have tried this as partly suggested by @bakyaw and the "for" loop adapted from an old script I used once.
myBootFunction<-function(x){
b <- boot(x, function(u,i) mean(u[i]), R = 999)
boot.ci(b, type = c("norm"))
}
meanList<-apply(p[c(2:ngens),c(1:nreps)],1,function(x)myBootFunction(x))
for(i in 1:49) {
low<-meanList[[i]][[4]][[2]]
high<-meanList[[i]][[4]][[3]]
CIMatrix<-cbind(high,low)}
Note the addition of c(2:ngens), without out it this error comes up.
[1] "All values of t are equal to 0.5 \n Cannot calculate confidence intervals"
However, this still only creates the CIMatrix as a 1x2 double matrix, rather than one with every generation in it.
sapply
statement. such aslast_output<-sapply(1:nrow(p),function(x)yourBootFunction(x))
So that you can get the row n and boot it. This will return alist
with the confidence intervals of each row, at the length of row count ofp
. does it help? – bakyawp
, the first row was 0.5 for all the columns. That is the warning for the first line. The rest should be calculated (at least for me it worked). As all the values are the same, there is no confidence interval for the first row. – bakyaw