In R: I am trying to find a way to generate vectors of 0s and 1s. Rather than drawing each 0 and 1 independently from a uniform distribution, I would like the 1s to come clustered, e.g. (1,0,0,0,0,0,1,0,1,1,1,1,0,1,0,0,0,0,1,0,0,0,...). In its simplest form, something like: "if the previous number was 1, then increase the likelihood of drawing 1". Or make the chance of drawing 1 depend on the sum of the last, say, 5 numbers drawn. Is there an efficient way of doing this, maybe even a package? It would be reminiscent of rbinom(n,1,prob) with variable prob.
2 Answers
You can try the following method using a loop. First, create a variable called "x" using sample, which assigns an initial value of 0 or 1.
Within the loop, use the sample function again, but this time pass values to the prob option. Here I've set the probability to a 70/30 split, i.e. if the previous number was 0, there is a 70% chance that the next number will also be 0, and likewise a 70% chance of another 1 if the previous value was 1.
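x = sample(c(0,1),1)  # initial value, 0 or 1 with equal probability
for(i in 2:100){
  if(x[i-1] == 0){
    # previous value was 0: 70% chance of another 0
    x[i] = sample(c(0,1),1,prob=c(0.7,0.3))
  } else {
    # previous value was 1: 70% chance of another 1
    x[i] = sample(c(0,1),1,prob=c(0.3,0.7))
  }
}
x[1:20]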
[1] 1 1 1 0 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 0
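The loop above is a two-state Markov chain, so it can be wrapped in a function. What follows is only a minimal sketch: the name r_clustered and the single parameter p_stay (the probability of repeating the previous value, 0.7 in the loop above) are made up for illustration. It preallocates the vector rather than growing it, and uses rle to check how clustered the result is.

```r
# Hypothetical helper: two-state Markov chain on {0, 1}.
# p_stay is an assumed parameter: the probability of repeating the
# previous value (the loop above uses 0.7 for both states).
r_clustered <- function(n, p_stay = 0.7) {
  stopifnot(n >= 1, p_stay >= 0, p_stay <= 1)
  x <- integer(n)                      # preallocate instead of growing x
  x[1] <- sample(0:1, 1)               # initial value, 50/50
  for (i in seq_len(n - 1) + 1) {      # i = 2, ..., n (empty when n == 1)
    stay <- rbinom(1, 1, p_stay) == 1  # TRUE: repeat the previous value
    x[i] <- if (stay) x[i - 1] else 1L - x[i - 1]
  }
  x
}

set.seed(1)
x <- r_clustered(1000, p_stay = 0.7)
runs <- rle(x)                         # lengths of consecutive equal values
mean(runs$lengths[runs$values == 1])   # average length of a run of 1s
```

Under this scheme the length of a run is geometric, so runs average about 1/(1 - p_stay) draws; raising p_stay towards 1 gives longer clusters.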
So I took good inspiration from Colin Charles and added a little adjustability. There are obviously many ways to compute prob so that it is influenced by prior draws. I ended up using a cutoff m on the sum of the last w draws to determine whether to use the low prob p0 or the high prob p1 for each 0/1 draw, to make a vector of length l.
f <- function (l, w, m, p0, p1){
  v = rbinom(w,1,p0) #Initialize the first w draws with p0
  for (i in w:(l-1)){ #Assumes l > w
    #Use p1 if the last w draws contain more than m ones, else p0
    v[i+1] <- ifelse(sum(v[(i-w+1):i]) > m,
                     rbinom(1,1,p1),
                     rbinom(1,1,p0))
  }
  return(v)
}
#Test:
set.seed(8)
plot(f(100, 5, 1, 0.1, 0.6)) #Clustered
plot(f(100, 5, 2, 0.1, 0.4)) #Less clustered
Gives: [plot of the first call: clustered 1s]
and (less clustered): [plot of the second call]
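As noted above, there are many ways to let prob depend on prior draws. One alternative to a hard cutoff, sketched here purely as an illustration, is to let prob rise linearly from p0 to p1 with the share of 1s in the last w draws. The name f_linear and the linear rule are assumptions, not part of the function above.

```r
# Hypothetical variant of f(): prob rises linearly from p0 to p1 with the
# share of 1s in the last w draws (an assumed rule, for illustration only).
f_linear <- function(l, w, p0, p1) {
  stopifnot(w >= 1, l >= w)
  v <- integer(l)
  v[seq_len(w)] <- rbinom(w, 1, p0)      # initialize the first w draws with p0
  for (i in seq_len(l - w) + w) {        # i = w + 1, ..., l
    share <- mean(v[(i - w):(i - 1)])    # fraction of 1s in the window
    v[i] <- rbinom(1, 1, p0 + (p1 - p0) * share)
  }
  v
}

set.seed(8)
v <- f_linear(100, 5, 0.1, 0.8)
plot(v)
table(v)                                 # how many 0s and 1s were drawn
```

With this rule p0 should be above 0 and p1 below 1; otherwise a window of all 0s (or all 1s) can never be left.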