Gremlin iterative conditional traversal

Question

I have a graph with the following structure:

Some vertices represent real-world items and some represent types, e.g. there is a vertex for "city" and vertices for specific cities like "London" or "Seattle". Each vertex can have an 'is-a' edge to its type vertex, e.g. "London" -(is-a)-> "city", "USA" -(is-a)-> "country".
Vertices can also be linked by an "in" relationship, e.g. "London" -(in)-> "UK", "Seattle" -(in)-> "Washington".
Some vertices may also have an "in-country" relationship, e.g. "Seattle" -(in-country)-> "USA", but some may not.
Multiple links are possible (e.g. a city disputed between two countries can have two "in-country" or "in" links); in this case multiple countries should be returned.

The task is, for each vertex, to find the country in which it resides (this is meaningless for generic vertices like "city", which should just produce null). I tried something like this:

v.as('loopstep').ifThenElse{it.out('is-a').has('ID', 'country').hasNext()}{
 it
}{
 it.ifThenElse{it.out('in-country').hasNext()}{
    it.out('in-country')
 }{
    it.out('in').loop('loopstep'){it.loops < 10 }
 }
}

but that produces an NPE on loop, e.g.:

java.lang.NullPointerException
    at com.tinkerpop.pipes.branch.LoopPipe.getLoops(LoopPipe.java:75)

etc. It looks like the loop cannot see the "loopstep" label. Am I doing something wrong? What would be the right way to write such a traversal query?

I guess the second ifThenElse should start with an it, not if. Some code that creates a graph according to your description would be nice. — Faber
i'm not sure i understand completely. first, is your starting vertex (i.e. v) a city? second, what is the point of the looping construct given your schema? all city vertices link to countries via one or more edges of label "in" or "in-country" - so where is the need to loop? maybe i'm misunderstanding something about your schema? — stephen mallette
@stephenmallette it can be anything - city, state, country, town, village, county, etc. - any object that can be located in another object. The point of the loop is to follow the "in" relationship recursively until we get to a country, which we can reach in two ways: either as a chain of "in"s ending in a country, or as a chain of "in"s followed by an "in-country" (we assume the target of "in-country" is always a country). But a village may not have a direct link to a country - it may go to a metropolis, then a county, then a state, etc. — StasM

stephen mallette · Accepted Answer · 2014-12-12T00:32:10

I don't think you need all the ifThenElse stuff. Assuming I now have your model right, I think you just need this:

gremlin> g = new TinkerGraph()                                                                     
==>tinkergraph[vertices:0 edges:0]
gremlin> usa = g.addVertex([name:"USA"])
==>v[0]
gremlin> va = g.addVertex([name:"VA"])    
==>v[1]
gremlin> fairfax = g.addVertex([name:"Fairfax"])
==>v[2]
gremlin> country = g.addVertex([ID:"country"])
==>v[3]
gremlin> state = g.addVertex([ID:"state"])
==>v[4]
gremlin> city = g.addVertex([ID:"city"])
==>v[5]
gremlin> g.addEdge(null, fairfax, va, "in")
==>e[6][2-in->1]
gremlin> g.addEdge(null, fairfax, city, "is-a")
==>e[7][2-is-a->5]
gremlin> g.addEdge(null, va, usa, "in")    
==>e[8][1-in->0]
gremlin> g.addEdge(null, va, state, "is-a")
==>e[9][1-is-a->4]
gremlin> g.addEdge(null, fairfax, usa, "in-country")
==>e[10][2-in-country->0]
gremlin> g.addEdge(null, usa, country, "is-a")
==>e[11][0-is-a->3]
gremlin> fairfax.as('x').out('in','in-country').loop('x'){it.loops<10 && it.object.out('is-a').ID.next()!='country'}.dedup.name
==>USA

Picking that last line apart: traverse out from the city (i.e. fairfax) along "in" or "in-country" edges. If I'm lucky, I traverse "in-country" and I'm done. Otherwise I loop back to the x placeholder, and keep looping while I have done fewer than 10 loops and the current vertex isn't a country; once it is, I break out of the loop because I've reached the country vertex I want to emit. I dedup because your schema allows multiple ways to reach a country via "in-country" and "in". You may still need some error handling depending on your dataset (for example, the next() call in the loop condition assumes every vertex visited has an 'is-a' edge), but I think this should inspire you to come up with the ultimate solution.
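
To make the logic of that loop concrete outside of Gremlin, here is a minimal sketch in plain Python. It is not TinkerPop code: the dict-based graph and the names EDGES, is_country, find_countries and max_hops are all made up for illustration, and the hop cap only approximates the it.loops < 10 guard. Unlike the next() call in the loop condition, it treats a vertex with no 'is-a' edge as "not a country" instead of failing, and it discards any path that reaches the hop cap without finding a country.

```python
from collections import deque

# Hypothetical adjacency model of the sample graph:
# vertex name -> {edge label: [target vertex names]}
EDGES = {
    "Fairfax": {"in": ["VA"], "in-country": ["USA"], "is-a": ["city"]},
    "VA": {"in": ["USA"], "is-a": ["state"]},
    "USA": {"is-a": ["country"]},
}


def is_country(edges, vertex):
    # A vertex without an 'is-a' edge is simply treated as "not a country".
    return "country" in edges.get(vertex, {}).get("is-a", [])


def find_countries(edges, start, max_hops=10):
    """Follow 'in' and 'in-country' edges from start and return the
    countries reached, deduplicated and in discovery order."""
    found = []
    frontier = deque([(start, 0)])
    while frontier:
        vertex, hops = frontier.popleft()
        if hops == max_hops:
            # Hop cap reached: give up on this path.
            continue
        for label in ("in", "in-country"):
            for target in edges.get(vertex, {}).get(label, []):
                if is_country(edges, target):
                    # Reached a country: record it once (the dedup step).
                    if target not in found:
                        found.append(target)
                else:
                    # Not a country yet: keep walking from this vertex.
                    frontier.append((target, hops + 1))
    return found


print(find_countries(EDGES, "Fairfax"))
```

In this model Fairfax reaches USA twice (directly via "in-country" and indirectly via VA), which is why the dedup matters. A type vertex such as "city" has no outgoing "in" edges, so it yields an empty list.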
