5
votes

Like the question says. The word "first" is important — there may be more relationships that match the same condition.

Real-world use case: every relationship has a timestamp property, and I want to find the first one that happened before a certain time (e.g. "before noon"). E.g.:

(head) -[time: 9]-> () -[time: 8]-> () -[time: 7]-> ...

Here's what I had (assume we know what the head node is):

MATCH (head) -[prevs:next*0..]-> (x) -[rel:next]-> (y)
WHERE NONE(prev IN prevs WHERE prev.time < {time})
AND rel.time < {time}
RETURN x, rel, y

That says, "traverse one or more relationships until we find one before {time}, AND none of the previous relationships were before {time}."

The query works, but surprisingly, it continues to traverse the list even after it finds one match. More precisely, it keeps expanding the variable length match — even though the NONE() check will clearly fail for the rest.

Maybe this is just a Cypher optimization that has yet to come? In the meantime, is there a more efficient way I can query this? (IOW, is there any way I can achieve a "short-circuit" after the first match?)
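
To make the "short-circuit" I'm after concrete, here is a minimal plain-Python model of the two behaviours. It is not Neo4j code, and it only approximates what the planner does. The list of relationship times and both function names are hypothetical, made up for illustration. The first function expands every path length and filters afterwards, like the query above appears to. The second stops at the first relationship that satisfies the condition.

```python
from typing import Optional, Sequence, Tuple


def first_match_exhaustive(times: Sequence[int], threshold: int) -> Tuple[Optional[int], int]:
    """Expand every path length, then filter (model of the observed behaviour).

    Returns (index of the first relationship with time < threshold,
             number of relationships expanded).
    """
    expanded = 0
    found = None
    for i, t in enumerate(times):  # rel = times[i], prevs = times[:i]
        expanded += 1
        # Equivalent of: NONE(prev IN prevs WHERE prev.time < threshold) AND rel.time < threshold
        if t < threshold and not any(p < threshold for p in times[:i]):
            found = i  # no early exit: the remaining lengths are still expanded
    return found, expanded


def first_match_short_circuit(times: Sequence[int], threshold: int) -> Tuple[Optional[int], int]:
    """Stop expanding as soon as one relationship satisfies the condition."""
    expanded = 0
    for i, t in enumerate(times):
        expanded += 1
        if t < threshold:
            return i, expanded  # every longer path would fail the NONE() check anyway
    return None, expanded


if __name__ == "__main__":
    times = list(range(1000, 0, -1))  # 1000, 999, ..., 1 (newest first)
    print(first_match_exhaustive(times, 996))
    print(first_match_short_circuit(times, 996))
```

Both functions find the same relationship. The difference is only in how many relationships get expanded along the way, which is the cost I'd like to avoid.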

Here's a console link to play around with:

http://console.neo4j.org/r/b4v2tl

Important: the setup creates a 1001-node linked list, so it may freeze your browser/tab for a minute or so. I recommend immediately disabling "Toggle Viz" when it unfreezes.

This console example also reverses the time-ordering from the example above, just for simplicity. So paste this query in:

MATCH (head:Node {id: 0}) -[prevs:next*0..]-> (x) -[rel:next]-> (y)
WHERE NONE(prev IN prevs WHERE prev.time > 5)
AND rel.time > 5
RETURN x, rel, y

That's querying for what should be the fifth relationship in the list.

You'll see Neo4j refuse to execute the query. If you change 0.. to e.g. 0..10, it'll work. Keep bumping up the upper bound from 10, and you'll see it get slower and slower. The guard kicks in by 100.

3
I'd report this in GitHub issues so it gets tracked by the dev team. If they don't consider it a bug, it could at least be a feature request. - Eve Freeman

3 Answers

0
votes

Try this:

MATCH (x)-[r:next]->(y) 
WHERE r.time > {time} 
RETURN x, r, y
ORDER BY r.time
LIMIT 1

EDIT

If your relationships are indexed on the time property, then:

START r=relationship:rels(time = {time})
MATCH (x)-[r1:next]->(y)-[r]->()
RETURN x,r1,y

0
votes

This is a shortcoming of Cypher before 2.1 (which will hopefully address it).

If it is urgent, check out an unmanaged extension that does this in a few lines of Java:

https://github.com/jexp/neo4j-activity-stream

https://github.com/jexp/neo4j-activity-stream/blob/master/src/main/java/org/neo4j/example/activity/ActivityStream.java#L59

0
votes

I believe this was addressed last year in Neo4j 3.0.3 as seen in this commit - https://github.com/neo4j/neo4j/commit/2c5c3dd