I'm having trouble querying a large graph with a traversal that uses repeat() to "hop" across vertices and edges. My goal is to infer indirect relationships between objects. Consider the following:
John--livesIn-->Paris
Paris--isIn-->France
From this, I expect to infer that John is based in France. Simple enough, and it works well on a small data set.
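For reference, here is a minimal sketch of how this toy graph could be built in the Gremlin console. The vertex labels 'person', 'city' and 'country' are placeholders made up for illustration; only the 'name' property and the two edge labels come from the example above.

```groovy
// Build the three vertices and two edges of the toy example.
// Vertex labels are hypothetical; the real graph may use different ones.
g.addV('person').property('name', 'John').as('john').
  addV('city').property('name', 'Paris').as('paris').
  addV('country').property('name', 'France').as('france').
  addE('livesIn').from('john').to('paris').   // John--livesIn-->Paris
  addE('isIn').from('paris').to('france').    // Paris--isIn-->France
  iterate()
```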
The query that I use is the following, where I make no more than 2 hops:
g.V().has('name','John')
.emit(loops().is(lt(2)))
.repeat(__.bothE().bothV().simplePath())
.inE('isIn').outV().path()
This works as expected until I run it against a graph of about 1,000 vertices and 3,000 edges. Then, after a few minutes, I get one of several errors (over the REST API), with no obvious pattern:
- Error: Error encountered evaluating script
- Error: 504 Gateway Time-out
- Error: Java heap space
- Error
I suspect that I am doing something wrong in my query. For example, when I set the number of "hops" to 1 (direct relationship) with .emit(loops().is(lt(1))), I would expect the results to come back quickly, since the traversal should not go into the repeat loop. However, this triggers the same issue.
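To be explicit, the one-hop variant I ran is the same query as above with only the emit bound changed. Nothing else differs, so there is still no times() or until() on the repeat:

```groovy
// Same traversal as the two-hop query; only lt(2) is changed to lt(1).
g.V().has('name', 'John').
  emit(loops().is(lt(1))).
  repeat(__.bothE().bothV().simplePath()).
  inE('isIn').outV().path()
```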
Many thanks for your help!
Olivier