CosmosDB Graph : "upsert" query pattern

Question

I am new to the Gremlin query language. I have to insert data into a Cosmos DB graph (using the Gremlin.Net package), whether or not the vertex (or edge) already exists in the graph. If the data exists, I only need to update the properties. I wanted to use this kind of pattern:

g.V().hasLabel('event').has('id','1').tryNext().orElseGet {g.addV('event').has('id','1')}

But this is not supported by Gremlin.Net / the Cosmos DB Graph API. Is there a way to write this kind of upsert as a single query?

Thanks in advance.

stephen mallette · Accepted Answer · 2018-04-10T16:23:13

There are a number of ways to do this but I think that the TinkerPop community has generally settled on this approach:

g.V().has('event','id','1').
  fold().
  coalesce(unfold(),
           addV('event').property('id','1'))

Basically, it looks for the "event" vertex with has() and uses the fold() step to coerce the result to a list. The list will either be empty or have a Vertex in it. Then coalesce() tries to unfold() the list: if it contains a Vertex, that Vertex is returned immediately; otherwise, it does the addV().
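To see why fold() matters, it can help to look at a variant that omits it. This is only an illustrative sketch of what does not work, assuming the same traversal source g and the same 'event' vertex as in the query above:

```groovy
// Variant WITHOUT fold(), shown only to illustrate the problem.
// If no 'event' vertex with id '1' exists, has() emits no traverser at all,
// so coalesce() is never invoked and addV() never runs: the result is empty.
g.V().has('event','id','1').
  coalesce(identity(),
           addV('event').property('id','1'))
```

With fold() in place, exactly one traverser (holding a possibly empty list) always reaches coalesce(), which is what lets the second branch fire when nothing was found.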

If the idea is to update existing properties if the element is found, just add property() steps after the coalesce():

g.V().has('event','id','1').
  fold().
  coalesce(unfold(),
           addV('event').property('id','1')).
  property('description','This is an event')
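The placement of property() decides when a value is written. As a rough sketch, assuming a hypothetical 'created' property that is not part of the original question, properties inside the addV() branch are set only on insert, while properties after coalesce() are set on every call:

```groovy
g.V().has('event','id','1').
  fold().
  coalesce(unfold(),
           addV('event').
             property('id','1').
             property('created','2018-04-10')).   // hypothetical: written only when the vertex is new
  property('description','This is an event')      // written whether the vertex is new or existing
```

This keeps the insert-only values from being overwritten when the vertex already exists.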

If you need to know if the vertex returned was "new" or not then you could do something like this:

g.V().has('event','id','1').
  fold().
  coalesce(unfold().
           project('vertex','exists').
             by(identity()).
             by(constant(true)),
           addV('event').property('id','1').
           project('vertex','exists').
             by(identity()).
             by(constant(false)))

Additional reading on this topic can be found in this question: "Why do you need to fold/unfold using coalesce for a conditional insert?"

Also note that optional edge insertion is described here: "Add edge if not exist using gremlin".
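For reference, a commonly used shape for that edge case looks roughly like the following. This is a sketch rather than a definitive implementation: the 'person' and 'attends' labels and the 'p1' id are hypothetical, and it assumes the graph provider accepts a mid-traversal V() step:

```groovy
// 'person', 'attends' and 'p1' are hypothetical names used only for illustration.
g.V().has('person','id','p1').as('p').                // source vertex, must already exist
  V().has('event','id','1').                          // target vertex, must already exist
  coalesce(__.inE('attends').where(outV().as('p')),   // return the edge if it is already there
           addE('attends').from('p'))                 // otherwise add person -> event
```

No fold() is needed here because the traverser sitting on the target vertex is what reaches coalesce(). If either vertex is missing, the traversal returns nothing and no edge is created.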

As a final note, while this question was asked regarding CosmosDB, the answer generally applies to all TinkerPop-enabled graphs. Of course, how a graph optimizes this Gremlin is a separate question. If a graph has native upsert capabilities, that capability may or may not be used behind the scenes of this Gremlin, so there may be better ways to implement upsert by way of the graph system's native API (of course, choosing that path reduces the portability of your code).
