Estimating probability of events through simulation

Question

The question is this: each person checks their hat at the door but then for some reason, the hat check attendant messes up the record and cannot remember which hat belongs to whom. The attendant decides to return the hats to everyone randomly.

I want to conduct a stimulation in R that estimates:

The probability that no one gets the hat back correctly.
The probability that at least one patron gets his or her own hat back.
Average number of patrons who get their own hats back.

For the simulation, let's set n=5.

I am thinking about assigning

hats <- c(1:5)
patrons <- c(1:5)

and make a function (a,b){a-b}

I got a little bit confused about how the R works, because I used to use Python and they have a different structure. But my thought process is like this:

patrons = float(input("How many people attend? "))
def number_of_patrons_assign:
   for i in patrons:
       return i
def number_of_hats_assign:
   for r in hats:
       return r
def counting:
   list=[]
   if number_of_patrons_assign == number_of_hats_assign
      return list

Sorry this might be wordy, but I haven't use Python for a year.

My partner uses R language and wrote this :

hats <- c(1:5)
patrons <- c(1:5)
vector <- NULL
test <- function(a, b)
{
  a-b
}
p <- 0
for(n in 1:10)
{
  x <- sample(hats, 5, replace = FALSE)
  y <- sample(patrons, 5, replace = FALSE)
  test(x, y)
  vector[n] <- c(if(test(x, y)==0) p <- 0,
  ifelse(test(x, y)==9, p <- 0, P <- 1))
}

I don't get what the function of the NULL there and how that works? This works but doesn't really look like what we are looking for.

vector <- NULL is just a bad way to initialize a vector vector to be filled up in the for loop — Rich Scriven

jbaums jbaums · Accepted Answer · 2014-10-17T08:07:18

Here's a very simple approach:

set.seed(1)
n <- 5
rowMeans(replicate(100000, {
  myhat <- sample(n) == seq_len(n)  
  c(all.correct=all(myhat), any.correct=any(myhat), n.correct=sum(myhat))
}))

# all.correct any.correct   n.correct 
#     0.00838     0.63163     0.99769

First, sample(n) randomly permutes the numbers 1:n, and we then compare each element to the corresponding element in the vector 1:n. This returns a logical vector of length n, and we can imagine that TRUE represents a correctly assigned hat, and FALSE reflects an incorrectly assigned hat. We assign this logical vector to myhat.

We then perform three tests on this vector.

First, are all the hats correctly assigned? If so, all elements of myhat will be TRUE, and thus all(myhat) will also be TRUE. Second, are any of the hats correctly assigned? If so, at least one element of myhat will be TRUE, and so any(myhat) will be TRUE. Finally, how many hats are correctly assigned? In R, logical vectors are treated as numeric (TRUE = 1, FALSE = 0) if necessary, so sum(myhat) returns the number of correctly assigned hats.

We replicate this a large number of times, and the resulting means of each of the three vectors is an estimate of the probability of the three cases.

We can now compare these to the truth, calculated analytically:

# All correct:
1/factorial(n)
# [1] 0.008333333

# Any correct
1 - round(factorial(n)/exp(1))/factorial(n)
# [1] 0.6333333

# Average number correct
1/n * n
# [1] 1

For reference, the second case (any hats correct) is calculated as the 1 minus the probability that all hats are incorrectly assigned. The probability that all hats are incorrectly assigned is the number of "derangements" divided by the total number of possible permutations. The third case (average number correct) is the sum of the expectations of each hat being assigned correctly (see here). There's a related post at math.stackexchange that might be of interest.

Estimating probability of events through simulation

1 Answers