I want to find the nodes that should be linked to a given node. Whether a pair should be linked is decided by the following logic, which uses node attributes and the attributes of existing edges:
A) The pair has the same zip (node attribute) and a name_similarity score (edge attribute) > 0.3, OR
B) the pair has a different zip and a name_similarity score > 0.5, OR
C) the pair has an edge of type "external_info" with decision = "connect".
D) The pair has an edge of type "external_info" with decision = "disconnect" (this excludes the pair, whatever A, B, or C say).
In short: (A | B | C) & (~D)
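To make the rule unambiguous, here is how I would write it as a plain-Python predicate. This is only a sketch and not Gremlin: the function name `should_link` and its arguments are made up for illustration, and it assumes a pair has at most one name_similarity edge and at most one external_info edge (a missing edge is passed as `None`).

```python
from typing import Optional


def should_link(same_zip: bool,
                score: Optional[float],
                decision: Optional[str]) -> bool:
    """Hypothetical reference predicate for (A | B | C) & (~D)."""
    # Assumption: a missing name_similarity edge counts as a score of 0.
    s = score if score is not None else 0.0
    a = same_zip and s > 0.3          # A: same zip, lower threshold
    b = (not same_zip) and s > 0.5    # B: different zip, higher threshold
    c = decision == "connect"         # C: external_info says connect
    d = decision == "disconnect"      # D: external_info says disconnect
    return (a or b or c) and not d
```

For example, `should_link(True, 1.0, "disconnect")` is `False` because D overrides A, while `should_link(False, 0.0, "connect")` is `True` because C alone is enough.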
I'm still new to Gremlin, so I'm not sure how to combine several conditions on edges and nodes in a single query.
Below is the code for creating the graph, as well as the expected results for that graph:
# creating nodes
(g.addV('person').property('name', 'A').property('zip', '123').
addV('person').property('name', 'B').property('zip', '123').
addV('person').property('name', 'C').property('zip', '456').
addV('person').property('name', 'D').property('zip', '456').
addV('person').property('name', 'E').property('zip', '123').
addV('person').property('name', 'F').property('zip', '999').iterate())
node1 = g.V().has('name', 'A').next()
node2 = g.V().has('name', 'B').next()
node3 = g.V().has('name', 'C').next()
node4 = g.V().has('name', 'D').next()
node5 = g.V().has('name', 'E').next()
node6 = g.V().has('name', 'F').next()
# creating name similarity edges
g.V(node1).addE('name_similarity').from_(node1).to(node2).property('score', 1).next() # over threshold
g.V(node1).addE('name_similarity').from_(node1).to(node3).property('score', 0.2).next() # under threshold
g.V(node1).addE('name_similarity').from_(node1).to(node4).property('score', 0.6).next() # over threshold (different zip, so it must exceed 0.5)
g.V(node1).addE('name_similarity').from_(node1).to(node5).property('score', 1).next() # over threshold
g.V(node1).addE('name_similarity').from_(node1).to(node6).property('score', 0).next() # under threshold
# creating external_info edges
g.V(node1).addE('external_info').from_(node1).to(node6).property('decision', 'connect').next() # F: linked by condition C
g.V(node1).addE('external_info').from_(node1).to(node5).property('decision', 'disconnect').next() # E: excluded by condition D
The expected output for input node A is nodes B (due to condition A), D (due to condition B), and F (due to condition C). Node E should not be linked, because of condition D.
I'm looking for a Gremlin query that will retrieve these results.