0
votes

I have a graph in TinkerPop3 in which the nodes are connected through edges with a given weight.

Given a node A in the graph I'm trying to traverse it to obtain all nodes which can be reached from A with less than N steps, and to sort them by a score. This score is calculated by mapping each edge in the path to that edge's weight divided by the step number and the total step, and finally summing the list.

Here is an example:

gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> a = g.addV('name','a').next()
==>v[0]
gremlin> b = g.addV('name','b').next()
==>v[2]
gremlin> c = g.addV('name','c').next()
==>v[4]
gremlin> d = g.addV('name','d').next()
==>v[6]
gremlin> e = g.addV('name','e').next()
==>v[8]
gremlin> a.addEdge('conn', b, 'weight', 1)
==>e[10][0-conn->2]
gremlin> a.addEdge('conn', c, 'weight', 3)
==>e[11][0-conn->4]
gremlin> b.addEdge('conn', d, 'weight', 3)
==>e[12][2-conn->6]
gremlin> d.addEdge('conn', e, 'weight', 9)
==>e[14][6-conn->8]

The expected scores for nodes related to "a" would be:

score_n = sum_over_n(weight_n-1n / (depth_n * total_N_from_a_to_n))
score_b = w_ab / (1 * 1) = 1 / 1 = 1
score_c = w_ac / (1 * 1) = 3 / 1 = 3
score_d = (w_ab / (1 * 2)) + (w_bd / (2 * 2)) = 1/2 + 3/4 = 5/4
score_e = (w_ab / (1 * 3)) + (w_bd / (2 * 3)) + (w_de / (3 * 3)) = 1/3 + 3/6 + 9/9 = 11/6

And thus, the result should be the corresponding nodes ordered decreasingly:

c e d b

I'm new to TinkerPop and Gremlin and I've been trying a combination of the repeat and sack steps without luck. The plan until now has been to accumulate the step in a sack for each traversal, to emit the score and node, and finally sort by score, but I'm worried about potential performance issues associated to the process.

1
Can you add the queries you tried but didn't work. It may be easier to start from where you are than from scratch. - Alaa Mahmoud
Furthermore it would be beneficial if you could provide a sample graph and an expected result. - Daniel Kuppitz
@DanielKuppitz I've added an example and expected result :) - Jesuspc
@AlaaMahmoud My query is still far from working so it's not worth sharing. Even though I've added an example in the description. - Jesuspc

1 Answers

1
votes

Not the easiest traversal to start with TinkerPop. First thing I've changed is the data type for weights.

graph = TinkerGraph.open()
g = graph.traversal()
a = g.addV('name','a').next()
b = g.addV('name','b').next()
c = g.addV('name','c').next()
d = g.addV('name','d').next()
e = g.addV('name','e').next()
a.addEdge('conn', b, 'weight', 1f)
a.addEdge('conn', c, 'weight', 3f)
b.addEdge('conn', d, 'weight', 3f)
d.addEdge('conn', e, 'weight', 9f)

This was necessary to get rid of rounding errors. It could also be done as part of the traversal, but I didn't want to add even more steps as it already became complex enough:

gremlin> g.withSack(0).V().has("name","a").
           repeat(outE("conn").as("e").values("weight").as("w").
                  sack(assign).by(loops()).sack(sum).by(constant(1)).sack().as("l").
                  select("e").inV().as("v").select("v","w","l").as("x").select("v")).emit().
           select(all, "x").as("x").
           count(local).as("c").select("x").
           map(unfold().sack(assign).by(select("w")).
                        sack(div).by(select("l")).
                        sack(div).by(select("c")).sack().fold()).
           project("score","v").by(sum(local)).by(select("x").tail(local, 1).select("v")).
           order().by(select("score"), decr).select("v").values("name")
==>c
==>e
==>d
==>b

UPDATE

What follows is a working traversal for TP 3.0.1:

gremlin> Gremlin.version()
==>3.0.1-incubating
gremlin> g.withSack(0D).V().has("name","a").
           repeat(outE("conn").as("e").values("weight").as("w").
                  select(all,"w").count(local).as("l").
                  select(last,"e").inV().as("v").select(last,"v","w","l").as("x").select("v")).emit().
           select(all,"x").as("x").
           count(local).as("c").select(last,"x").
           map(unfold().as("m").select("w").sack(sum).
               select(last,"m").select("l").sack(div).
               select(last,"m").select("c").sack(div).sack().fold()).as("score").
           select(last, "score","x").by(sum(local)).by(tail(local, 1).select("v")).
           order().by(select("score"), decr).select("x").values("name")
==>c
==>e
==>d
==>b

The only good thing about this traversal is, that the weight properties don't have to have another data type - int works fine.