I'm a little confused by the GraphX PageRank implementation.

https://github.com/apache/spark/blob/feaa07802203b79f454454445c0a12a2784ccfeb/graphx/src/main/scala/org/apache/spark/graphx/lib/PageRank.scala#L115-L160

In particular, line #138, https://github.com/apache/spark/blob/feaa07802203b79f454454445c0a12a2784ccfeb/graphx/src/main/scala/org/apache/spark/graphx/lib/PageRank.scala#L138.

Why isn't the PageRank for a vertex defined as resetProb + (1.0 - resetProb) * msgSum instead of oldPR + (1.0 - resetProb) * msgSum?
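
To make the comparison concrete, here is a minimal plain-Python sketch (not GraphX code) that runs both update rules on a made-up three-vertex graph. It rests on assumptions that are not stated in this post: that in the oldPR variant the messages carry rank changes (deltas) rather than full ranks, that each message is divided by the sender's out-degree, and that the initial message is resetProb / (1.0 - resetProb). The names EDGES, full_rank_update and delta_update are hypothetical. Under those assumptions the two rules appear to reach the same ranks.

```python
# Hypothetical toy graph: vertex -> list of out-neighbours.
# Every vertex has at least one out-edge to keep the sketch simple.
EDGES = {0: [1, 2], 1: [2], 2: [0]}
RESET_PROB = 0.15  # default mentioned in the thread


def full_rank_update(iterations=200):
    """resetProb + (1.0 - resetProb) * msgSum, where msgSum is the sum of
    FULL ranks sent along incoming edges (split by sender out-degree)."""
    pr = {v: RESET_PROB for v in EDGES}
    for _ in range(iterations):
        msg_sum = {v: 0.0 for v in EDGES}
        for src, dsts in EDGES.items():
            for dst in dsts:
                msg_sum[dst] += pr[src] / len(dsts)
        pr = {v: RESET_PROB + (1.0 - RESET_PROB) * msg_sum[v] for v in EDGES}
    return pr


def delta_update(iterations=200):
    """oldPR + (1.0 - resetProb) * msgSum, ASSUMING msgSum is the sum of
    rank DELTAS (change since the previous step), not full ranks."""
    pr = {v: 0.0 for v in EDGES}
    # Assumed initial message, chosen so the first update yields resetProb.
    msg_sum = {v: RESET_PROB / (1.0 - RESET_PROB) for v in EDGES}
    for _ in range(iterations):
        new_pr = {v: pr[v] + (1.0 - RESET_PROB) * msg_sum[v] for v in EDGES}
        delta = {v: new_pr[v] - pr[v] for v in EDGES}
        pr = new_pr
        # Only the change in rank is forwarded to the neighbours.
        msg_sum = {v: 0.0 for v in EDGES}
        for src, dsts in EDGES.items():
            for dst in dsts:
                msg_sum[dst] += delta[src] / len(dsts)
    return pr


if __name__ == "__main__":
    print(full_rank_update())
    print(delta_update())
```

If the messages carried full ranks instead of deltas, the oldPR rule would keep adding to the rank on every step, so the delta assumption is what makes the two forms comparable.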

Can anyone explain this difference? Note that the links don't point to master (sorry if that causes confusion), but master still has the same code.

1 Answer

What would be the point of doing it like this?

resetProb + (1.0 - resetProb) * msgSum

resetProb does not change during the algorithm's execution (by default it is 0.15), so it is just a constant. Why do you think it makes sense to add a constant to the PageRank of every vertex?