0
votes

I have a graph in Neo4J that looks like this:

(a {flag:any})<- (many, 0 or more) <-(b {flag:true})<- (many, 0 or more) <-(c {flag: any})
-OR-
(a {flag:any})<- (many, 0 or more) <-(d)
-OR-
(a {flag:any})

Where a, b, c, and d all have the same type, and the relations are also the same. All the nodes have flag:false except where noted. Of course the real graph is a tree, not a vine.

In short, every path should begin with a and end with the first flag=true node, or should begin with a and get all children down to the leaf of the tree. Per the last example, a doesn't have to have any children - it can be a root and a leaf. Finally, in the first case, we'll never pull in c. b stops the traversal.

How can I write this query?

I have gotten it to work with a path and several unwind/collect statements that are basically horse****, lol. I want a better query, but I am so confused now it is not going to happen.

2

2 Answers

1
votes

I would approach this query as the UNION of the two cases:

MATCH shortestPath((a)<-[:REL_TYPE*1..]-(end:Label {flag: true}))
RETURN a, end
UNION
MATCH (a)<-[:REL_TYPE*0..]-(end:Label)
WHERE NOT (end)<-[:REL_TYPE]-()
RETURN a, end

Let's break it down:

  1. To express that we only want to traverse until the first flag is true, we use shortestPath.

  2. To express that we want to traverse down to the leaf, we use the following formalisation: a node is a leaf if it has no relationships that could be continued, captured by a WHERE NOT filter on patterns.

This should give an idea of the basic ideas to use for such queries -- please provide some feedback so that I can refine the answer.

1
votes

The following query should return all 3 kinds of paths. I assume that all relevant nodes are labeled Foo, and all relevant relationships have the BAR type.

The first term of the WHERE clause looks for paths (of length 0 or more, because of the variable-length relationship pattern used in the MATCH clasue) that end in a node with a true flag with no true flags earlier in the path (except for possibly the starting node). The second term looks for paths (of length 0 or more) ending with a leaf node, where no nodes (except for possibly the starting node) have a true flag.

MATCH p=(a:Foo)<-[:BAR*0..]-(b:Foo)
WHERE
  (b.flag AND NONE(x IN NODES(p)[1..-1] WHERE x.flag)) OR
  ((NOT (b)<-[:BAR]-()) AND NONE(y IN NODES(p)[1..] WHERE y.flag))
RETURN p;

NOTE: Variable-length relationship patterns with no upper bound (like [:BAR*0..]) can be very expensive, and can take a very long time or cause an out of memory error. So, you may need to specify a reasonable upper bound (for example, [:BAR*0..5]).