4
votes

I have a directed weighted graph G=(V,E).

In this graph the weight of edge(v[i],v[j]) is the count of transition between v[i] and v[j].

I am trying to determine the best way to accomplish task: how to find the probability P of path from node A to node B

All weights are positive integers.

For example,

weight(A,B)=count of transition from A to B
weight(B,C)=count of transition from B to C
weight(B,D)=count of transition from B to D
weight(D,C)=count of transition from D to C

And we have several paths:

A->B->C 
A->B->D->C

So, how to calculate probability P of these paths and choose the best one?

1
It's not clear where probabilities are supposed to come from here. What do you mean by "count of transition"? And what makes one path the best? You haven't given us a clear enough problem statement for us to answer the question. - user2357112 supports Monica
@user2357112 For example, the count of transition - the number of transitions from A to the place B . The best path is a path with the most probability. My attempt to solve this: P=the sum of weights of the edges belonging to this path. But I have a problem that the path with the greatest length will have the greatest probability, but It is wrong - Andrei

1 Answers

4
votes

It can be solved by reducing the problem to shortest path problem, assuming we are indeed discussing probabilities (that means, each weight is in range [0,1]).

Let the graph be G=(V,E), and the probability between two vertices denoted as w(u,v).

Define: w'(u,v) = -log(w(u,v))

The shortest path from some node s to some node t is then the shortest path in the graph when using w' as weight function. You can find the shortest path using Dijkstra's algorithm or Bellman-Ford algorithm

Proof:

For a path v1->v2->...->vn, its probability is w(v1,v2) * w(v2,v3) * ... * w(vn-1,vn).

When using w' as the weight, the cost of this path when using any shortest path algorithm is:

d(v1,vn) = w'(v1,v2) + w'(v2,v3) + ... + w'(vn-1,vn) = 
d(v1,vn) = -log(w(v1,v2)) + -log(w(v2,v3) + ... + -log(w(vn-1,vn)) = 
d(v1,vn) = -1* [ log(w(v1,v2)) + log(w(v2,v3)) + ... + log(w(vn-1,vn)) =
d(v1,vn) = -1* log(w(v1,v2) * w(v2,v3) * ... * w(vn-1,vn)) 

This obviously applies also for the minimal path found from s to t.
This means that this path have minimal value of:

s(s,t) = -1* log(w(s,v2) * w(v2,v3) * ... * w(vn-1,t)) 

And since log is monotonic function, it is also minimal value of -1 * w(s,v2) * w(v2,v3) * ... * w(vn-1,t), which is the maximal value of w(s,v2) * w(v2,v3) * ... * w(vn-1,t), and that's exactly the probability.