Given the following sample data, I'd like to construct a Gremlin query which returns Alice's network of ruby connections, 3 levels deep:

Vertex: Alice
Vertex: Bobby
Vertex: Cindy
Vertex: David
Vertex: Eliza

Edge: [Alice] -> [Rates(tag:ruby,value:0.9)] -> [Bobby]
Edge: [Bobby] -> [Rates(tag:ruby,value:0.8)] -> [Cindy]
Edge: [Cindy] -> [Rates(tag:ruby,value:0.7)] -> [David]
Edge: [David] -> [Rates(tag:ruby,value:0.6)] -> [Eliza]   # ignored, level 4
Edge: [Alice] -> [Rates(tag:java,value:0.9)] -> [Eliza]   # ignored, not ruby

So the returned data should be something like:

Bobby: [0.9]
Cindy: [0.9, 0.8]
David: [0.9, 0.8, 0.7]

Where each vertex ID is returned, along with an array of the path of rating values.

I'm working in the current release of JanusGraph (Gremlin 3). I'm pretty new to Gremlin; I've been puzzling over a few recipes which have things in common with my desired query, but I still don't see quite how to get there...

Thanks very much for any help or advice you can offer.

1 Answer

When asking Gremlin questions, it's always helpful to those trying to answer if you provide a sample graph that can easily be cut and pasted into the Gremlin Console, like this:

graph = TinkerGraph.open()
g = graph.traversal()
g.addV().property('name','alice').as('a').
  addV().property('name','bobby').as('b').
  addV().property('name','cindy').as('c').
  addV().property('name','david').as('d').
  addV().property('name','eliza').as('e').
  addE('rates').property('tag','ruby').property('value',0.9).from('a').to('b').
  addE('rates').property('tag','ruby').property('value',0.8).from('b').to('c').
  addE('rates').property('tag','ruby').property('value',0.7).from('c').to('d').
  addE('rates').property('tag','ruby').property('value',0.6).from('d').to('e').
  addE('rates').property('tag','java').property('value',0.9).from('a').to('e').iterate()

Using this graph I came up with this approach to getting the result you desire:

gremlin> g.V().has('name','alice').
......1>   repeat(outE().has('tag','ruby').inV()).
......2>     times(3).
......3>     emit().
......4>   group().
......5>     by('name').
......6>     by(path().
......7>        unfold().
......8>        has('value').
......9>        values('value').
.....10>        fold())
==>[bobby:[0.9],cindy:[0.9,0.8],david:[0.9,0.8,0.7]]

Everything up through line 3, ending with the emit(), is probably pretty self-explanatory: find "alice", then repeatedly follow outgoing edges tagged "ruby" to the vertices they point at, to a depth of 3, and emit each vertex discovered along the way. That gets you the vertices you care about:

gremlin> g.V().has('name','alice').
......1>   repeat(outE().has('tag','ruby').inV()).
......2>     times(3).
......3>     emit()
==>v[2]
==>v[4]
==>v[6]
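
To see why exactly those three vertices come back, here is a rough model of that loop in plain Python rather than Gremlin. The EDGES list and the reachable() function are hypothetical stand-ins for the sample graph and for repeat(...).times(3).emit(); nothing here is a TinkerPop API.

```python
# Hypothetical in-memory copy of the sample graph: (source, tag, value, target).
EDGES = [
    ("alice", "ruby", 0.9, "bobby"),
    ("bobby", "ruby", 0.8, "cindy"),
    ("cindy", "ruby", 0.7, "david"),
    ("david", "ruby", 0.6, "eliza"),
    ("alice", "java", 0.9, "eliza"),
]

def reachable(start, tag, max_depth):
    """Return the vertices reached within max_depth hops over edges carrying `tag`."""
    frontier = [start]
    emitted = []
    for _ in range(max_depth):
        # One hop: follow matching outgoing edges from every vertex in the frontier.
        frontier = [dst for cur in frontier
                    for (src, t, _value, dst) in EDGES
                    if src == cur and t == tag]
        # Emit whatever this hop reached before looping again.
        emitted.extend(frontier)
    return emitted

print(reachable("alice", "ruby", 3))  # ['bobby', 'cindy', 'david']
```

"eliza" never shows up: the ruby edge to her sits at hop 4, and the direct edge from "alice" is tagged java.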

The more complicated part comes after this, where you need the path information for each vertex so that you can grab the "value" properties along each "rates" edge. I chose to use group() so that I could easily get the Map structure you wanted. Obviously, if "bobby" appeared twice in the tree you would end up with two lists of ratings for his Map entry.

If you pick apart what's happening in group() you can see that it is modulated by two by() modulators. The first corresponds to the key in the Map (obviously, I'm assuming uniqueness on "name"). The second extracts the path from the current traverser (the person vertex). Before going any further, take a look at what the output looks like with just the path():

gremlin> g.V().has('name','alice').
......1>   repeat(outE().has('tag','ruby').inV()).
......2>     times(3).
......3>     emit().
......4>   group().
......5>     by('name').
......6>     by(path()).next()
==>bobby=[v[0], e[10][0-rates->2], v[2]]
==>cindy=[v[0], e[10][0-rates->2], v[2], e[11][2-rates->4], v[4]]
==>david=[v[0], e[10][0-rates->2], v[2], e[11][2-rates->4], v[4], e[12][4-rates->6], v[6]]
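
The shape of those Map values can be imitated outside Gremlin too. The following is only a sketch in plain Python, assuming a hypothetical EDGES list and paths_from() helper: each traverser carries its history as an alternating vertex, edge, vertex list, and the result is keyed by the name of the vertex the path ends on. Keeping every path per key in a list is a choice made for this sketch, not a statement about how group() stores its values.

```python
# Hypothetical in-memory copy of the sample graph, not a Gremlin API.
EDGES = [
    {"from": "alice", "to": "bobby", "tag": "ruby", "value": 0.9},
    {"from": "bobby", "to": "cindy", "tag": "ruby", "value": 0.8},
    {"from": "cindy", "to": "david", "tag": "ruby", "value": 0.7},
    {"from": "david", "to": "eliza", "tag": "ruby", "value": 0.6},
    {"from": "alice", "to": "eliza", "tag": "java", "value": 0.9},
]

def paths_from(start, tag, max_depth):
    """Return each path of 1..max_depth hops as [vertex, edge, vertex, ...]."""
    frontier = [[start]]
    found = []
    for _ in range(max_depth):
        # Extend every current path by one matching edge and its target vertex.
        frontier = [
            path + [edge, edge["to"]]
            for path in frontier
            for edge in EDGES
            if edge["from"] == path[-1] and edge["tag"] == tag
        ]
        found.extend(frontier)
    return found

# Key each path by the name of the vertex it ends on.
grouped = {}
for path in paths_from("alice", "ruby", 3):
    grouped.setdefault(path[-1], []).append(path)

for name, paths in grouped.items():
    print(name, [len(p) for p in paths])
```

The printed lengths (3, 5 and 7 elements) line up with the console output above: one more edge and one more vertex for each extra hop.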

The steps that follow path() manipulate that path into the form you want. They unfold each path into its individual elements, keep only the edges by filtering on "value" (a property that only the edges have), extract that property, and then fold the values back into a list for each entry in the Map.
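
As a rough analogy in plain Python (not Gremlin), the same four steps can be applied to a hand-written stand-in for "david"'s path. The dictionaries and the edge_values() helper are hypothetical; they only mimic the idea that vertices carry "name" while edges carry "value".

```python
# Hypothetical stand-in for david's path: vertices and edges as property dicts.
path = [
    {"name": "alice"},
    {"label": "rates", "tag": "ruby", "value": 0.9},
    {"name": "bobby"},
    {"label": "rates", "tag": "ruby", "value": 0.8},
    {"name": "cindy"},
    {"label": "rates", "tag": "ruby", "value": 0.7},
    {"name": "david"},
]

def edge_values(path, key="value"):
    """Mimic unfold().has(key).values(key).fold() on a single path."""
    elements = iter(path)                           # unfold(): one element at a time
    with_key = (e for e in elements if key in e)    # has('value'): only edges carry it
    values = (e[key] for e in with_key)             # values('value'): pull the property
    return list(values)                             # fold(): back into one list

print(edge_values(path))  # [0.9, 0.8, 0.7]
```

Filtering on a property that only edges have avoids having to test element types, at the cost of breaking if a vertex ever gains a "value" property of its own.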