Understanding different methods for adding edges with Gremlin-Python

Question

I'm trying to understand the differences in methods, and the best syntax for adding edges (between existing vertices) in Gremlin-Python.

Having read several posts here on SO, I've subdivided some different approaches I found into a few questions.

Many thanks for any feedback in advance!

1) What is the best order of adding properties to the edge, while creating it: which one of these would be the better option (in case there is any significant difference at all)?

g.V().property("prop1", "prop1_val").as_("a")
 .V().property("prop2", "prop2_val").as_("b")
 .addE("some_relationship")
# rest of traversal option 1:
 .property("prop1_val_weight", 0.1)
 .from_("a").to("b")
# rest of traversal option 2:
 .from_("a").to("b")
 .property("prop1_val_weight", 0.1)

2) What is the purpose, and correct usage, of " __.V() "?

g.V().property("prop1", "prop1_val")
 .as_("a").V().property("prop2", "prop2_val")
 .as_("b").addE("some_relationship")
 .property("prop1_val_weight", 0.1)
# AND THEN:
 .from_("a").to("b")
# VERSUS: 
 .from_(__.V("a")).to(__.V("b"))

3) What are the differences between using "property" vs. "properties":

g.V().property("prop1", "prop1_val").as_("a")
# VERSUS:
g.V().properties("prop1", "prop1_val").as_("a")
# REST OF THE TRAVERSAL:
 .V().property("prop2", "prop2_val").as_("b")
 .addE("some_relationship")
 .property("prop1_val_weight", 0.1)
 .from_("a").to("b")

4) What happens when there is no ".to()" vertex/vertices specified, and in this case, also using " __.V() " :

g.V().property("prop1", "prop1_val").as_("a")
 .V().property("prop2", "prop2_val").as_("b")
 .addE("some_relationship").to(__.V()
 .has("prop2", "prop2_val"))

5) What are the reasons for adding " .profile()" at the end of a traversal:

g.V('Alice').as_('v').V('Bob').coalesce(inE('spokeWith')
 .where(outV().as_('v')).addE('spokeWith')
 .property('date', 'xyz').from_('v'))
 .profile()

6) What is the correct usage, and in general the added advantage, of using the "coalesce" step while adding edges, like it's being used in the traversal at 5 ^^ ?

7) And a few general questions:

what is the advantage of also looking for the label, e.g. " g.V().has("LABEL1", "prop1", "prop1_val").as_("a") [etc.]"
after assigning a traversal to a variable (e.g. " t = g.V() ... ") in several steps, is it sufficient to call "t.iterate()" only once, at the end, or should this be done each time?
at which point in a script should one call "tx.commit()": is calling it only once, at the end of several traversals, sufficient?

stephen mallette · Accepted Answer · 2019-08-12T11:11:31

1) What is the best order of adding properties to the edge, while creating it

I don't think there is a "best order" operationally speaking, but personally I think it reads better to see the from() and to() immediately follow addE().

2) What is the purpose, and correct usage, of " __.V() "?

You can't use a mid-traversal V() in that fashion. Doing so is basically saying "find me the vertex in the graph with the T.id of 'a'", which doesn't exist. "a" in your case is a step label that is local to the scope of that traversal only.
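The distinction between an id lookup and a step label can be sketched in plain Python. This is only an analogy, not gremlin-python code: the dictionaries and the helper names lookup_by_id and select_label are made up for illustration.

```python
# Pure-Python analogy, not gremlin-python: the "graph" is a dict keyed by
# vertex id, and step labels live in a separate, per-traversal dict.
graph = {1: {"prop1": "prop1_val"}, 2: {"prop2": "prop2_val"}}

# as_("a") / as_("b") remember which vertex the traverser was on at that step.
step_labels = {"a": 1, "b": 2}


def lookup_by_id(vertex_id):
    """Analogue of V(<id>): searches the graph's ids only."""
    return graph.get(vertex_id)


def select_label(label):
    """Analogue of from_("a") / to("b"): resolves a step label."""
    return graph[step_labels[label]]


found_by_id = lookup_by_id("a")      # no vertex has the id "a"
found_by_label = select_label("a")   # the vertex labelled "a" in this traversal
```

In this toy model the id lookup comes back empty, while resolving the label returns the vertex, which is why from_("a").to("b") is the form to use.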

3) What are the differences between using "property" vs. "properties":

There is a massive difference. property(k,v) is a mutation step which modifies the graph element in the stream by adding/updating the property key with the specified value. properties(k...) gets the list of properties specified by the provided keys (i.e. a read operation).
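As a rough illustration of write versus read, here is a plain-Python analogy in which a dict stands in for a vertex. The functions property_ and properties are hypothetical stand-ins, not the gremlin-python steps themselves.

```python
# Pure-Python analogy, not gremlin-python: a dict stands in for a vertex.
vertex = {"prop1": "old"}


def property_(element, key, value):
    """Analogue of property(k, v): writes to the element and passes it on."""
    element[key] = value
    return element


def properties(element, *keys):
    """Analogue of properties(k...): reads what is stored under the given keys."""
    return [(k, element[k]) for k in keys if k in element]


property_(vertex, "prop1", "prop1_val")          # mutation
read = properties(vertex, "prop1", "prop1_val")  # both arguments are treated as keys
```

Note that in the read call "prop1_val" is interpreted as a second key rather than a value, so it simply matches nothing.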

4) What happens when there is no ".to()" vertex/vertices specified, and in this case, also using " __.V() "

Why not start up Gremlin Console and see:

gremlin> g.addV().addE('self')
==>e[17][16-self->16]
gremlin> g.addV().as('z').addE('self').property('x','y').from('z').to(V().has('person','name','nobody'))
The provided traverser does not map to a value: v[20]->[TinkerGraphStep(vertex,[~label.eq(person), name.eq(nobody)])]
Type ':help' or ':h' for help.
Display stack trace? [yN]

5) What are the reasons for adding " .profile()" at the end of a traversal:

The same reasons you would profile your code or explain a SQL query: to get a more detailed understanding of what that code is doing. Maybe you need to see if an index is being used properly, or figure out which step is taking the longest to execute, to see if you can better optimize your traversal.

6) What is the correct usage, and in general the added advantage, of using the "coalesce" step while adding edges, like it's being used in the traversal at 5 ^^ ?

Unless I'm misreading it, I don't see any reason to use coalesce() as it is used in 5 as there is no second argument to it. You typically use coalesce() in the context of adding vertices/edges when you want to "get or create" or "upsert" an element - you can read more about that here.
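The "get or create" shape can be sketched in plain Python. This is an analogy under stated assumptions, not gremlin-python code: the in-memory edge list and the helpers find_edge, add_edge, coalesce and get_or_create are invented for illustration. The point is that the first branch looks for an existing edge and the second branch only runs when the first yields nothing.

```python
# Pure-Python analogy of the "get or create" shape; not gremlin-python code.
edges = []  # each edge is a (from_vertex, label, to_vertex) tuple


def find_edge(src, label, dst):
    """First branch: return the existing edge, if any."""
    return [e for e in edges if e == (src, label, dst)]


def add_edge(src, label, dst):
    """Second branch: create the edge and return it."""
    edge = (src, label, dst)
    edges.append(edge)
    return [edge]


def coalesce(*branches):
    """Evaluate branches in order and return the first non-empty result."""
    for branch in branches:
        result = branch()
        if result:
            return result
    return []


def get_or_create(src, label, dst):
    return coalesce(
        lambda: find_edge(src, label, dst),  # "get"
        lambda: add_edge(src, label, dst),   # "create", only if nothing was found
    )


first = get_or_create("Alice", "spokeWith", "Bob")   # creates the edge
second = get_or_create("Alice", "spokeWith", "Bob")  # finds the existing edge
only_get = coalesce(lambda: find_edge("Alice", "spokeWith", "Carol"))
```

With a single branch, as in only_get, there is no fallback, so nothing is created when the lookup comes back empty.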

what is the advantage of also looking for the label, e.g. " g.V().has("LABEL1", "prop1", "prop1_val").as_("a") [etc.]"

It's best to include the vertex label when doing a has(), as you then explicitly namespace the key you are searching on. You used "prop1" in your example. If "prop1" was not a globally unique key and you only wanted "LABEL1" vertices, but "LABEL2" vertices also had "prop1" values, then you might not get what you want. If you use globally unique key names this isn't a problem, but to maximize the portability of your Gremlin you might want to get into the practice anyway, as I think there are some graph systems out there that require the specification of the label.
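The namespacing effect can be shown with a small plain-Python analogy. The list of dicts and the has helper below are made up for illustration and only loosely mirror the two- and three-argument forms of has().

```python
# Pure-Python analogy, not gremlin-python: two vertices share the key "prop1".
vertices = [
    {"label": "LABEL1", "prop1": "prop1_val"},
    {"label": "LABEL2", "prop1": "prop1_val"},
]


def has(elements, *args):
    """Accepts (key, value) or (label, key, value)."""
    if len(args) == 3:
        label, key, value = args
        elements = [e for e in elements if e["label"] == label]
    else:
        key, value = args
    return [e for e in elements if e.get(key) == value]


by_key_only = has(vertices, "prop1", "prop1_val")                 # matches both labels
by_label_and_key = has(vertices, "LABEL1", "prop1", "prop1_val")  # matches LABEL1 only
```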

after assigning a traversal to a variable (e.g. " t = g.V() ... ") in several steps, is it sufficient to call "t.iterate()" only once, at the end, or should this be done each time?

iterate() is a terminating step. You typically call it to generate any side-effects that the traversal may produce. Once "iterated", calling it again has no effect: iteration is a one-way operation, and once the traverser objects are exhausted there is nothing left to iterate().
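A Python generator behaves in a similar one-way fashion, which makes for a convenient analogy. The sketch below is not gremlin-python code: make_traversal and iterate are hypothetical stand-ins, and the log list plays the role of side-effects written to the graph.

```python
# Pure-Python analogy, not gremlin-python: a generator stands in for a traversal.
def make_traversal(log):
    """Each step performs a side effect (appending to log) before yielding."""
    for name in ("a", "b"):
        log.append(f"added edge for {name}")
        yield name


def iterate(traversal):
    """Drain the traversal for its side effects, discarding the results."""
    for _ in traversal:
        pass


log = []
t = make_traversal(log)  # nothing has run yet
built_only = list(log)   # snapshot: still empty, no side effects so far
iterate(t)               # performs both side effects
iterate(t)               # exhausted: no further side effects
```

In this model, building the traversal does nothing on its own, a single iterate() at the end is enough to trigger every side effect, and a second call changes nothing.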

at which point in a script should one call "tx.commit()": is calling it only once, at the end of several traversals, sufficient?

If you are following best practices you aren't calling commit() at all. Gremlin Server manages your transactions for you. One request (i.e. traversal) is basically one transaction. The only time you ever need to call commit() is if you choose to manage transactions yourself in the context of a session or a really long-running script. If you are building a new application you should avoid either of those options and simply utilize bytecode-based traversals.
