8
votes

There is an island which is represented by square matrix nxn.

A person on the island is standing at any given co-ordinates (x,y). He can move in any direction one step right, left, up, down on the island. If he steps outside the island, he dies.

Let the island be represented as (0,0) to (n-1,n-1) (i.e nxn matrix) & person is standing at given co-ordinates (x,y). He is allowed to move n steps on the island (along the matrix). What is the probability that he is dead after he walks n steps on the island?

What should be the approach to find the probability using programming techniques?

I have a mathematical method, but I don't know whether it's correct or not. Here it is:

The total number of outcomes are n^n. To calculate the number of outcomes which can lead to death of the person:

For each of the four directions, check how many steps can lead to him going out of the matrix. Then, apply the high school probability formula. For e.g. suppose the total number of steps he can take are 5; (x, y) = (2,1) [indexing is 0-based]. So, he needs to take 3 steps in north dir. to fall out of island. Keeping them in a group: (NNN) and making other 2 steps as any of the 4 choices, we have the formula: 4*4*3. Similarly, for other 3 directions. Finally, the probality = (sum of the calculated death outcomes) / (total outcomes)

This was a Google interview question.

5
Surely the winning strategy is to stand still, whereby the probability of dying is 0. The problem, as you have stated it, omits any incentive for the man to take any steps at all, and it is completely obscure why he should step into any danger.High Performance Mark
@HighPerformanceMark I don't think it's about 'winning', just basically about a third party calculating the probability that the man dies if he takes n random steps (from what I can tell).Bernhard Barker
@Dukeling: I think you must be misinterpreting the question. Surely if OP had meant to ask a question about a random walk on a lattice (s)he would have done so !? No, I think that this is more a question about finding a strategy for staying alive longest.High Performance Mark
@HighPerformanceMark "What is the probability that he is dead after he walks n steps on the island?" This doesn't say you have to pick the steps (fair enough, it doesn't say anything about walking randomly either). Also, OP's attempt at the problem seems to indicate simply calculating the probability. "random walk on a lattice" is reasonably advanced terminology and we both know questions are often not stated exactly as meant. But arguing any more over assumptions made won't be constructive, OP has to clarify.Bernhard Barker
Sankalp, please edit question and say if n and N are related. Also say if each step's direction is random and the four directions are equally likely.James Waldby - jwpat7

5 Answers

9
votes

TL;DR: Recursion. (Or "mathematical induction", if you're snobbish.)

(In what follows, "he is dead after he walks n steps on the island" is assumed to mean "he dies after less than or equal to n steps". If you take it to mean "he dies after exactly n steps", the answer will be slightly different. I'll discuss it briefly at the end.)

We have an NxN matrix where the value in each cell represents the probability of dying in n steps if we started from that cell.

Consider the probability of dying in 0 steps. Clearly, this is 0.0 for every location inside the island, and 1.0 everywhere outside it.

What's the probability of dying in 1 steps? You have four directions you can move in, with equal probability. So for each cell, you take its four neighbors, find their probability of dying in 0 steps, and average them together. (If a neighbor is outside the matrix, you consider its probability to be 1.0.)

Similarly, the probability of dying in k steps starting from a given cell is the average of the probability of dying in k-1 steps starting from its neighbour cells.

Python code:

from itertools import product as prod 

def prob_death(island_size, steps):
    if island_size < 1 or steps < 0: raise ValueError
    new_prob = [[0. for i in range(island_size)] for j in range(island_size)]
    if steps == 0:
        return new_prob
    old_prob = prob_death(island_size, steps - 1)
    directions = [(0, -1), (1, 0), (0, 1), (-1, 0)]
    for (i, j, direction) in prod(range(island_size), range(island_size), directions):
        neighbor_i = i + direction[0]
        neighbor_j = j + direction[1]
        if neighbor_i >= 0 and neighbor_i < island_size and \
                neighbor_j >= 0 and neighbor_j < island_size:
            prob_death_this_way = old_prob[neighbor_i][neighbor_j]
        else: # neighbor is outside the island 
            prob_death_this_way = 1.
        new_prob[i][j] += 0.25* prob_death_this_way
    return new_prob

Now, let's test it out a bit: (mpr is just a function for printing matrices nicely)

>>> mpr(prob_death(5, 0))
0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000 0.000000

As expected: You can't die in 0 steps if you start inside the island.

>>> mpr(prob_death(5,1))
0.500000 0.250000 0.250000 0.250000 0.500000
0.250000 0.000000 0.000000 0.000000 0.250000
0.250000 0.000000 0.000000 0.000000 0.250000
0.250000 0.000000 0.000000 0.000000 0.250000
0.500000 0.250000 0.250000 0.250000 0.500000

This is what we'd expect. If you start at a corner cell, you have 0.5 probability of dying in 1 step: 2 out of your 4 neighbors are outside the island. If you start on an edge, only 1 neighbor is outside, so your probability of dying is 0.25. Everywhere else, all neighbors are inside the island, so probability of dying in 1 step is 0.0.

>>> mpr(prob_death(5, 5))
0.806641 0.666016 0.622070 0.666016 0.806641
0.666016 0.437500 0.349609 0.437500 0.666016
0.622070 0.349609 0.261719 0.349609 0.622070
0.666016 0.437500 0.349609 0.437500 0.666016
0.806641 0.666016 0.622070 0.666016 0.806641

The probability of dying in 5 steps. I can't verify the exact values, but it looks about right: The probability of dying is highest in the corners, a little lower at the edges, and decreases steadily inwards.

That solves the problem of dying in less than or equal to n steps.

Now, to find the probability of dying in exactly n steps: Let the probability of dying in less than or equal to n steps starting from (x,y) be denoted by P(x,y,n). Then the probability of dying in exactly n steps is the probability of surviving for n-1 steps, times the probability of dying in the nth step given that we survived for n-1 steps: (1-P(x,y,n-1))*(P(x,y,n) - P(x,y,n-1)). (I'm not all that sure about this formula; correct me if I'm wrong.)

1
votes

First, start with a matrix with the probability of being in the square (x, y) in the 0th step. Let's simulate it with a 4x4 matrix. Assuming the guy starts at (1, 2):

After 0 steps:
  0.00%   0.00%   0.00%   0.00%
  0.00%   0.00% 100.00%   0.00%
  0.00%   0.00%   0.00%   0.00%
  0.00%   0.00%   0.00%   0.00%
outside:   0.00%
----
After 1 steps:
  0.00%   0.00%  25.00%   0.00%
  0.00%  25.00%   0.00%  25.00%
  0.00%   0.00%  25.00%   0.00%
  0.00%   0.00%   0.00%   0.00%
outside:   0.00%
----
After 2 steps:
  0.00%  12.50%   0.00%  12.50%
  6.25%   0.00%  25.00%   0.00%
  0.00%  12.50%   0.00%  12.50%
  0.00%   0.00%   6.25%   0.00%
outside:  12.50%
----
After 3 steps:
  4.69%   0.00%  12.50%   0.00%
  0.00%  14.06%   0.00%  12.50%
  4.69%   0.00%  14.06%   0.00%
  0.00%   4.69%   0.00%   4.69%
outside:  28.12%
----
After 4 steps:
  0.00%   7.81%   0.00%   6.25%
  5.86%   0.00%  13.28%   0.00%
  0.00%   9.38%   0.00%   7.81%
  2.34%   0.00%   5.86%   0.00%
outside:  41.41%
----

Here's a python program that calculates this:

class Table:
    def __init__(self, n, outside=0):
        self.T = [[0]*n for i in xrange(n)]
        self.outside = outside

    def add(self, i, j, value):
        n = len(self.T)
        if 0<=i<n and 0<=j<n:
            self.T[i][j] += value
        else:
            self.outside += value

    def make_next(self):
        n = len(self.T)
        Q = Table(n, self.outside)

        for i in xrange(n):
            for j in xrange(n):
                value = self.T[i][j] / 4.0
                Q.add(i-1, j, value)
                Q.add(i+1, j, value)
                Q.add(i, j-1, value)
                Q.add(i, j+1, value)
        return Q

    def __repr__(self):
        return '\n'.join(' '.join(
                    '{:6.2f}%'.format(item*100) 
                    for item in line)
                    for line in self.T) + \
               '\noutside: {}'.format('{:6.2f}%'.format(self.outside*100))


N = 4
T = Table(N)
T.add(1, 2, 1)

for k in xrange(N+1):
    print 'After {} steps:'.format(k)
    print T
    print '----'

    T = T.make_next()
0
votes

This is a very complex and nuanced problem -- I suspect the goal of the interviewers wasn't to hear you come up with the answer, but rather to see how you approach the problem.

The problem becomes a checkerboard n squares on a side, with an apparently random placement of a piece which must move n spaces, moving in an apparently random cardinal direction each turn. The probability that the piece will leave the board is thus related not merely to the size of the board, but the placement of the piece. Since any move off of the board also counts as leaving the board, the path the piece takes is therefore also relevant.

For a 2×2 grid, the piece has a 2/7 probability of staying on the board (four paths stay on the board, fourteen total paths, irrespective of which of the four possible starting points).

For a 3×3 grid, the piece has a 2/11 probability of staying on the board (16 paths stay on, 88 total paths) if it starts on a corner. If it starts on a side, it has a 3/11 probability (24 paths stay on). If it starts in the center, it has a 9/22 probability (36 paths stay on). Since the piece has a 4/9 probability each for starting in a corner or on a side, and a 1/9 probability for starting in the center, its total probability of staying on the board is (2/11 + 3/11) × 4/9 + 9/22 × 1/9 = 0.247.

The 4×4 grid becomes (obviously) yet more complicated, but it is perhaps worth noting that the grids do conform to a pattern:

2×2:

- - 1 - -
- 2 - 2 -
1 - 2 - 1
- 2 - 2 -
- - 1 - -

3×3:

- - - 1 - - -
- - 3 - 3 - -
- 3 - 9 - 3 -
1 - 9 - 9 - 1
- 3 - 9 - 3 -
- - 3 - 3 - -
- - - 1 - - -

I started on the 4×4 grid, but no thanks. The starting square appears to have 36 paths leading to it, and actually identifying them is, well, tedious. In every case, the piece starts in the center of the pattern, and we can draw the board as we see fit.

There is clearly a pattern, however, and it fairly cries out that there is a mathematical symmetry, but I have neither the time nor patience to work it out. The poles each have one path, the next group of endpoints have n paths, and the next group appear to have n2 paths. I suspect I've made a mistake in counting the innermost set of endpoints for the 4×4 grid, but if I haven't, 36 = n(n - 1)2, which strongly suggests a pattern for the rings.

Anyway, as I said, the problem is very complicated, and almost certainly your approach was being judged, not your ability to answer it. Still, it was a fun exercise.

0
votes

Approach should be relying on probabilty formula - no of favorable cases/total number of case

Given coordinate (x,y) and steps - n Total number of ways user can take steps -
1st step - 4 ways 2nd step - 4 * 3 (assuming he cannot step backwards) 3rd step - 4* 3^2 ... .... ... nth step - 4*3^(n-1) Arthmetic Progression will give total steps.

Farovable cases - i.e. crossing the boundries - recursive fuction with recurse on all 4 direction and incr variable count whenever matrix boundry is crossed.

Divide both to get the answer.

0
votes

Thanks Anubhav C for great solution above, which was very helpful for illuminating the problem's solution. I think using 0.25 as probability (mentioned above by ) can be misleading and incorrect! If we look at the probability #dead_cases/#total_possible_moves the results will be different.

Consider the following code for finding die/surviving cases:

def winLoss_stat(N, steps):
    newStats = [[[0, 0, 0] for i in range(N)] for j in range(N)]
    if steps==0:
        newStats = [[[1, 0, 0] for i in range(N)] for j in range(N)]
        return newStats
    oldStats = winLoss_stat(N, steps-1)
    for i in range(N):
        for j in range(N):
            for d in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
                indX = i + d[0]
                indY = j + d[1]
                if indX >=0 and indX < N and indY >= 0 and indY<N:
                    newStats[i][j][0] += oldStats[indX][indY][0]
                    newStats[i][j][1] += oldStats[indX][indY][1]
                    newStats[i][j][2] += oldStats[indX][indY][2]
                else:
                    newStats[i][j][1] += 1
                    if steps==1:
                        newStats[i][j][2] = 1
    return newStats

(or equivalently, for one step (using dfs - recursive):

class winLoss:
    def __init__(self, N):
        self.win = 0 
        self.loss = 0
        self.N = N
    def winLoss(self, x, y, n):
        if x < 0 or y < 0 or x >= self.N or y >= self.N:
            self.loss += 1
            return
        if n == 0:
            self.win += 1
            return
        self.winLoss(x - 1, y, n-1)
        self.winLoss(x, y - 1, n-1)
        self.winLoss(x+1, y, n-1)
        self.winLoss(x, y+1, n-1)



    wl = winLoss(n)
    wl.winLoss(i, j, n)
for any i,j start point and n (size of square)
)

The winLoss_stat returns three values for starting point at each square i, j: 
[numbers of survive cases, numbers of die cases before or at k steps, numbers of death exactly at step k]

The results are as the following for n=4 (4X4), steps=4:

              0              1              2             3
0  [58, 24, 12]   [93, 34, 18]   [93, 34, 18]  [58, 24, 12]
1  [93, 34, 18]  [150, 46, 28]  [150, 46, 28]  [93, 34, 18]
2  [93, 34, 18]  [150, 46, 28]  [150, 46, 28]  [93, 34, 18]
3  [58, 24, 12]   [93, 34, 18]   [93, 34, 18]  [58, 24, 12]

This translates to the following probabilities for 
1. death before or at k steps:
          0         1         2         3
0  0.292683  0.267717  0.267717  0.292683
1  0.267717  0.234694  0.234694  0.267717
2  0.267717  0.234694  0.234694  0.267717
3  0.292683  0.267717  0.267717  0.292683

2. death exactly at k steps:
          0         1         2         3
0  0.146341  0.141732  0.141732  0.146341
1  0.141732  0.142857  0.142857  0.141732
2  0.141732  0.142857  0.142857  0.141732
3  0.146341  0.141732  0.141732  0.146341

The results can be verified by looking at the numbers of win-loss from step 1 to 3 for n=3:

winLoss_stat(3, 1)
           0          1          2
0  [2, 2, 1]  [3, 1, 1]  [2, 2, 1]
1  [3, 1, 1]  [4, 0, 0]  [3, 1, 1]
2  [2, 2, 1]  [3, 1, 1]  [2, 2, 1]

winLoss_stat(3, 2)
           0           1          2
0  [6, 4, 2]   [8, 5, 2]  [6, 4, 2]
1  [8, 5, 2]  [12, 4, 4]  [8, 5, 2]
2  [6, 4, 2]   [8, 5, 2]  [6, 4, 2]

winLoss_stat(3, 3)
             0            1            2
0  [16, 12, 4]  [24, 13, 8]  [16, 12, 4]
1  [24, 13, 8]  [32, 20, 8]  [24, 13, 8]
2  [16, 12, 4]  [24, 13, 8]  [16, 12, 4]