
In R: I am trying to figure out a way to generate vectors with values 0 or 1. Rather than drawing each 0 and 1 independently from a uniform distribution, I would like the 1s to come clustered, e.g. (1,0,0,0,0,0,1,0,1,1,1,1,0,1,0,0,0,0,1,0,0,0,...). In its simplest form, something like: "if the previous number was 1, then increase the likelihood of drawing 1". Or make the chance of drawing 1 depend on the sum of the last, say, 5 numbers drawn. Is there an efficient way of doing this, maybe even a package? It would be reminiscent of rbinom(n,1,prob) with a variable prob.

Just a thought: maybe generate a series of runs, each either all 0s or all 1s (chosen randomly or alternating), with each run's length drawn from a probability distribution (normal, Poisson, etc.). Then concatenate them all together. You would be able to dictate the total length, and the cluster lengths would follow the distribution you chose. - Chris
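The run-based idea in this comment could be sketched roughly as follows. This is only an illustration under assumptions not in the comment: the function name clustered_runs and its arguments are made up, runs strictly alternate, and run lengths are 1 plus a Poisson draw so that no run is empty.

```r
# Hypothetical helper: build a 0/1 vector of length n from alternating runs.
# Assumption: run lengths are 1 + Poisson(mean - 1), so every run has length >= 1.
clustered_runs <- function(n, mean_zero_run = 5, mean_one_run = 3) {
  stopifnot(n >= 1, mean_zero_run >= 1, mean_one_run >= 1)
  start <- sample(c(0, 1), 1)                 # random value for the first run
  values <- (start + seq_len(n) - 1) %% 2     # alternate 0/1; n runs always suffice
  means <- ifelse(values == 1, mean_one_run, mean_zero_run)
  run_lengths <- 1 + rpois(n, lambda = means - 1)
  # Expand each run, concatenate, and trim to exactly n values.
  rep(values, times = run_lengths)[seq_len(n)]
}

set.seed(1)
x <- clustered_runs(50)
```

Larger mean_one_run values give longer clusters of 1s. Generating n runs is wasteful for large n, so a tighter bound on the number of runs would be worth adding in real use.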

2 Answers


You can try the following method using a loop. First, create a variable called "x" using sample, which assigns an initial value of 0 or 1.

Within the loop you can use the sample function again, but this time you pass values to the prob argument. Here I've set the probabilities to a 70/30 split, i.e. if your previous number was a 0, there is a 70% chance that the next number will be a 0, and likewise a 70% chance of another 1 if your previous value was 1.

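x = sample(c(0,1),1)   # initial value, 0 or 1 with equal probability
for(i in 2:100){
  if(x[i-1] == 0){
    # previous value was 0: 70% chance of another 0
    x[i] = sample(c(0,1),1,prob=c(0.7,0.3))
  } else {
    # previous value was 1: 70% chance of another 1
    x[i] = sample(c(0,1),1,prob=c(0.3,0.7))
  }
}
head(x, 20)   # show the first 20 values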

[1] 1 1 1 0 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 0
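Since the question asks for something efficient, here is a loop-free sketch of the same symmetric rule. It assumes the stay probability is the same for 0 and 1 (as in the 70/30 example above); the name markov_binary and the argument p_stay are made up for illustration. The trick is that the value at position i equals the starting value plus the number of flips so far, modulo 2.

```r
# Hypothetical helper: symmetric two-state chain on {0, 1} without a loop.
# p_stay is the probability of repeating the previous value (0.7 in the answer).
markov_binary <- function(n, p_stay = 0.7) {
  stopifnot(n >= 1, p_stay >= 0, p_stay <= 1)
  start <- sample(0:1, 1)            # initial value, 0 or 1 with equal probability
  flips <- runif(n - 1) > p_stay     # TRUE where the value switches (prob 1 - p_stay)
  # Each flip toggles the state, so count flips cumulatively and reduce modulo 2.
  (start + cumsum(c(0, flips))) %% 2
}

set.seed(42)
x <- markov_binary(100, p_stay = 0.7)
```

This shortcut only works because both states share one stay probability. With different probabilities for 0 and 1, the loop version is the straightforward choice.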

So I took inspiration from Colin Charles's answer and added a little adjustability. There are obviously many ways to make prob depend on prior draws. I ended up comparing the sum of the last w draws against a cutoff m to decide whether to use the low probability p0 or the high probability p1 for each draw, producing a 0/1 vector of length l.

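f <- function (l, w, m, p0, p1){

  v = rbinom(w,1,p0) #Initialize the first w draws with p0

    for (i in w:(l-1)){
      # sum of the last w draws above cutoff m: draw with p1, otherwise with p0
      v[i+1] <- ifelse(sum(v[(i-w+1):i]) > m,
                       rbinom(1,1,p1),
                       rbinom(1,1,p0))
    }
  v
}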

plot(f(100, 5, 1, 0.1, 0.6)) #Clustered
plot(f(100, 5, 2, 0.1, 0.4)) #Less clustered
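Besides plotting, run lengths give a quick numeric check of how clustered a vector is. A minimal sketch using base R's rle(); the helper name one_run_lengths is made up for illustration, and the input is the example vector from the question.

```r
# Hypothetical helper: lengths of the consecutive runs of 1s in a 0/1 vector.
one_run_lengths <- function(v) {
  r <- rle(v)                 # run-length encoding: $lengths and $values
  r$lengths[r$values == 1]    # keep only the runs made of 1s
}

x <- c(1,0,0,0,0,0,1,0,1,1,1,1,0,1,0,0,0,0,1,0,0,0)
one_run_lengths(x)        # 1 1 4 1 1
mean(one_run_lengths(x))  # 1.6
```

A longer mean run of 1s at a similar overall share of 1s indicates stronger clustering, so the same helper can be applied to the output of f with different settings.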



