
(Apologies, if the title is not accurate/useful, I'm not sure what else to call it... Ideas welcome...)

Let's say I have a game that consists of several states S1, S2, S3, ... and coin-tosses that transition you from one state to another. There is also a state W where you win and a state L where you lose. Games always start in state S1. What is the probability Pwin(S1) of winning such a game?

As an example, let's take the following rules to the game:

  • S1: Heads brings you to S2, tails brings you to S3
  • S2: Heads brings you to S3, tails brings you to L
  • S3: Heads brings you to L, tails brings you to W

Now, if I need to figure out what the overall chance of winning the game is (given fair coin-tosses), I can simply start at the bottom:

  • Pwin(S3) = 0.5 * 0% + 0.5 * 100% = 50%
  • Pwin(S2) = 0.5 * Pwin(S3) + 0.5 * 0% = 25%
  • Pwin(S1) = 0.5 * Pwin(S2) + 0.5 * Pwin(S3) = 37.5%
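The back-substitution above can be sketched in a few lines of Python (my own illustration, not part of the question; variable names like `p_s1` are hypothetical):

```python
# Each state's win probability is the probability-weighted average of its
# successors' win probabilities, anchored by Pwin(W) = 1 and Pwin(L) = 0.
p_w, p_l = 1.0, 0.0
p_s3 = 0.5 * p_l + 0.5 * p_w    # S3: heads -> L, tails -> W
p_s2 = 0.5 * p_s3 + 0.5 * p_l   # S2: heads -> S3, tails -> L
p_s1 = 0.5 * p_s2 + 0.5 * p_s3  # S1: heads -> S2, tails -> S3
print(p_s3, p_s2, p_s1)  # 0.5 0.25 0.375
```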

The problem comes in when I, for example, replace the last rule with this:

  • S3: Heads brings you back to S1, tails brings you to W

Notice how this creates a circular reference where Pwin(S3) depends on Pwin(S1) and vice versa.

I am looking for an algorithm that solves for Pwin(S1) for any possible rule-set, for an arbitrary number of states, and for "coins" that have more than 2 sides (i.e. each state transitions to a random choice among several possible following states, possibly including an immediate loop back to itself). I might even be faced with a situation where the "coins" aren't fair, i.e. the probabilities of transitioning to the next states are not all equal.
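For the circular rule-set above, the defining equations Pwin(S) = sum over successors of (transition probability * Pwin(successor)) form a linear system that can be solved directly. A minimal sketch, assuming Python with NumPy (the matrix names Q and r are my own notation, not from the question):

```python
import numpy as np

# Solve Pwin = Q @ Pwin + r, i.e. (I - Q) @ Pwin = r, where Q holds the
# transition probabilities among the non-terminal states S1, S2, S3 and
# r holds the probability of jumping directly to the win state W.
Q = np.array([
    [0.0, 0.5, 0.5],   # S1: heads -> S2, tails -> S3
    [0.0, 0.0, 0.5],   # S2: heads -> S3 (tails -> L contributes nothing)
    [0.5, 0.0, 0.0],   # S3: heads -> S1 (circular rule)
])
r = np.array([0.0, 0.0, 0.5])  # only S3 can go straight to W (tails)
pwin = np.linalg.solve(np.eye(3) - Q, r)
print(pwin)  # [Pwin(S1), Pwin(S2), Pwin(S3)]
```

For this rule-set the solution comes out to Pwin(S1) = 0.6, Pwin(S2) = 0.4, Pwin(S3) = 0.8, which you can verify by substituting back into the equations. The same setup works for unfair coins and any number of sides: each row of Q just holds the actual transition probabilities.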

I think I remember something that this can be solved with a matrix equation, but I'm not even sure what to call this problem to do a real Google search for an answer... I don't even know what tags to pick. :)

Any pointers would be much appreciated.

Given all values are probabilities that sum to 1, I have a feeling that this problem should always have one unique solution. Is that correct?


1 Answer


You're describing a Markov chain model, specifically an absorbing Markov chain, since "win" and "lose" are absorbing states. Set up the state transition matrix P and partition it so that Q contains the transitions among the non-terminal (transient) states and R contains the transitions from transient states into the absorbing states. The absorption probabilities are then B = (I − Q)⁻¹ R, where row i of B gives the probability of ending in each absorbing state when starting from transient state i. (The stationary-distribution equation πP = π tells a consistent story: in the long run, all probability mass ends up on the absorbing states, with zeros everywhere else.)
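A sketch of that computation for the circular rule-set from the question, assuming Python with NumPy (the partition into Q and R follows the standard absorbing-chain treatment):

```python
import numpy as np

# Q: transient-to-transient transition probabilities (S1, S2, S3).
# R: transient-to-absorbing transition probabilities (columns: W, L).
# B = (I - Q)^(-1) R gives, per row, the probability of absorption
# into each of W and L starting from that transient state.
Q = np.array([
    [0.0, 0.5, 0.5],   # S1: heads -> S2, tails -> S3
    [0.0, 0.0, 0.5],   # S2: heads -> S3
    [0.5, 0.0, 0.0],   # S3: heads -> S1
])
R = np.array([
    [0.0, 0.0],        # S1 never jumps straight to W or L
    [0.0, 0.5],        # S2: tails -> L
    [0.5, 0.0],        # S3: tails -> W
])
B = np.linalg.solve(np.eye(3) - Q, R)
print(B)  # row i = [Pwin, Plose] starting from S(i+1)
```

Each row of B sums to 1, which is a handy sanity check: from every starting state the game eventually ends in either W or L.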