Like the question says. The word "first" is important — there may be more relationships that match the same condition.
Real-world use case: every relationship has a timestamp property, and I want to find the first one that happened before a certain time (e.g. "before noon"). E.g.:
(head) -[time: 9]-> () -[time: 8]-> () -[time: 7]-> ...
—
Here's what I had (assume we know what the head node is):
MATCH (head) -[prevs:next*0..]-> (x) -[rel:next]-> (y)
WHERE NONE(prev IN prevs WHERE prev.time < {time})
AND rel.time < {time}
RETURN x, rel, y
That says, "traverse one or more relationships until we find one before {time}, AND none of the previous relationships were before {time}."
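To make the intent concrete, here's a rough Python sketch of the semantics I'm after, outside of Neo4j entirely. The `times` list is a hypothetical stand-in for the `time` property on each `next` relationship, in traversal order from the head; the function name is made up for illustration.

```python
from typing import Optional, Sequence, Tuple


def first_rel_before(times: Sequence[int], threshold: int) -> Optional[Tuple[int, int]]:
    """Return (index, time) of the first relationship with time below threshold.

    `times` models the `time` property of each `next` relationship,
    listed in traversal order starting at the head node (an assumption
    made for this sketch, not something Neo4j exposes in this form).
    """
    for index, rel_time in enumerate(times):
        if rel_time < threshold:
            # Short-circuit: stop walking the list at the first match.
            return index, rel_time
    # No relationship in the list happened before the threshold.
    return None


print(first_rel_before([9, 8, 7, 6, 5], 8))  # (2, 7)
```

The point is the early `return`: once the first match is found, nothing after it in the list is ever looked at.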
The query works, but surprisingly, it continues to traverse the list even after it finds one match. More precisely, it keeps expanding the variable length match, even though the NONE() check will clearly fail for the rest.
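As a toy model of why this hurts (an assumption about the shape of the work, not a description of how Neo4j actually executes the query), the Python sketch below counts predicate checks when every path length is tried versus when expansion stops at the first match. The function and its `short_circuit` flag are hypothetical.

```python
from typing import Sequence, Tuple


def count_checks(times: Sequence[int], threshold: int, short_circuit: bool) -> Tuple[int, int]:
    """Return (matches, predicate_checks) for a toy model of the query.

    Each `end` plays the role of `rel`; everything before it plays `prevs`.
    """
    matches = 0
    checks = 0
    for end in range(len(times)):
        prefix_ok = True
        # Models NONE(prev IN prevs WHERE prev.time < threshold).
        for prev in times[:end]:
            checks += 1
            if prev < threshold:
                prefix_ok = False
                break
        # Models rel.time < threshold.
        checks += 1
        if prefix_ok and times[end] < threshold:
            matches += 1
            if short_circuit:
                # Desired behavior: stop expanding after the first match.
                break
    return matches, checks


times = list(range(1000, 0, -1))  # 1000, 999, ..., 1
print(count_checks(times, 996, short_circuit=True))
print(count_checks(times, 996, short_circuit=False))
```

Both calls find the same single match; the difference is only in how many checks are spent on candidates that can no longer succeed.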
Maybe this is just a Cypher optimization that has yet to come? In the meantime, is there a more efficient way I can query this? (IOW, is there any way I can achieve a "short-circuit" after the first match?)
—
Here's a console link to play around with:
http://console.neo4j.org/r/b4v2tl
Important: the setup creates a 1001-node linked list, so it may freeze your browser/tab for a minute or so. I recommend immediately disabling "Toggle Viz" when it unfreezes.
This console example also reverses the time-ordering from the example above, just for simplicity. So paste this query in:
MATCH (head:Node {id: 0}) -[prevs:next*0..]-> (x) -[rel:next]-> (y)
WHERE NONE(prev IN prevs WHERE prev.time > 5)
AND rel.time > 5
RETURN x, rel, y
That's querying for what should be the fifth relationship in the list.
You'll see Neo4j refuse to execute the query. If you change 0.. to e.g. 0..10, it'll work. Keep bumping up 10, and you'll see it get slower and slower. The guard kicks in by 100.