4
votes

I am new to Cypher and Neo4j (and Stack Overflow for that matter), so hopefully this question has an easy answer. I have spent hours reading the documentation and Googling, but found nothing exactly on point to answer this question.

Apparently I need more reputation to post images, so I will do my best with text.

I can find the path from one node to another like this:

    MATCH path = (a {name:"a"})-[*]-(x {name:"x"})
    RETURN extract(r IN nodes(path) | r);

This could return something like the following two paths:

    (a)-[:RED]->(b)<-[:BLUE]-(c)-[:RED]->(f)<-[:RED]-(g)-[:BLUE]->(h)<-[:RED]-(x)

    (z)-[:RED]->(h)<-[:RED]-(x)

So far so good. There are, of course, lots of other relationships and nodes in the database that connect to these nodes, but this is the right path.

I need to find the one node along the path that has two RED relationships coming into it like so:

    -[:RED]->(findme)<-[:RED]-

In the two path examples above, findme = (f) and (h).

Note: There will be lots of matches in the database, but I only want the one in the path (should only be one). Also, there could be many nodes and different relationships in the path, or as few as only 3 nodes each connected by the RED relationship.

I tried matching using:

    MATCH (a)-[*]-(b)-[:RED]->(findme)<-[:RED]-(c)-[*]-(x) RETURN findme;

which works as long as there are at least 5 nodes, but won't work for 3 nodes.

How can I find this pattern match within a path?

3
Do the 'red' relationships have to be part of the path, or is it sufficient that the 'findme' node is on the path, with 'red' relationships incoming from nodes that are outside the path? - jjaderberg
Sidenote: the extraction you do in your return clause seems extraneous. Unless you're doing something more to the nodes than pulling them out of the path, you can just RETURN nodes(path) directly. - jjaderberg
@jjaderberg, yes, the 'red' nodes must be on the path. Sorry about not clarifying that point. Also, thanks for the tip about RETURN nodes(path). I didn't know that command would also return the relationships. Very nice! - Brad Stone

3 Answers

1
votes

How about this:

    MATCH p=(a {name:"a"})-[*]-(x {name:"x"})
    WITH nodes(p) AS nodes
    UNWIND nodes AS n
    WITH n WHERE exists(()-[:RED]->(n)<-[:RED]-())
    RETURN n

  1. MATCH on the path from a to x.
  2. Get all nodes in the path.
  3. For each node in the path, keep only the nodes that have two incoming RED relationships.
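
Note that the exists() check above considers any RED relationships in the database, not only the ones on the matched path. If the two RED relationships themselves have to be part of the path, one possible variation is to look at each inner node together with the two path relationships next to it. This is only a sketch: it assumes a Neo4j version that supports range(), size() on lists, and endNode(), and the names ns, rs, i, r_left, and r_right are introduced here purely for illustration.

```cypher
MATCH p = (a {name:"a"})-[*]-(x {name:"x"})
WITH nodes(p) AS ns, relationships(p) AS rs
// Inner nodes sit at indexes 1 .. size(ns)-2;
// rs[i-1] and rs[i] are the two path relationships touching ns[i].
UNWIND range(1, size(ns) - 2) AS i
WITH ns[i] AS n, rs[i - 1] AS r_left, rs[i] AS r_right
// Keep n only if both neighbouring relationships are RED and point into n.
WHERE type(r_left) = "RED" AND type(r_right) = "RED"
  AND endNode(r_left) = n AND endNode(r_right) = n
RETURN n
```

For a path with only two nodes the range is empty, so no rows are returned.
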
1
votes

You were close! As you already know, the * signifies variable length/depth. You can use that sign by itself, like you do in your queries, or you can specify a range. Without a specified range, the * means "one or more". If you apply your last query

    MATCH (a)-[*]-(b)-[:RED]->(findme)<-[:RED]-(c)-[*]-(x)
    WHERE a.name = "a" AND x.name = "x"
    RETURN findme

to the path with three nodes

    (a)-[:RED]->(findme)<-[:RED]-(x)

the two relationships in the path could have been matched by the non-variable part of the query. But then there are no relationships left for the variable part to match. The parts (a)-[*]-(b) and (c)-[*]-(x) each require "one or more", but there are zero relationships left to match.

You can change the meaning of * to "zero or more" by explicitly specifying that as a range. In this way the two variable parts of the query will match also when there are no relationships left in the path.

    MATCH (a)-[*0..]-(b)-[:RED]->(findme)<-[:RED]-(c)-[*0..]-(x)
    WHERE a.name = "a" AND x.name = "x"
    RETURN findme

This query will match paths consisting of just the "red triple" that you are looking for, as well as paths that have relationships extending on one or both sides of the triple.

Note that when a "zero or more" range matches the "zero case", the pattern still has two identifiers but only one node, so that node is bound to both identifiers. For the identifiers on the left-hand side of the pattern above, when it matches the three-node path, saying

node a and node b connected by a zero-depth relationship

is really a silly way of saying

a and b are the same node

You can check out this console that has a five-node path between (a {name: "a"}) and (x {name: "x"}) and a three-node path between (a {name: "a2"}) and (x {name: "x2"}). The same pattern will match both of these paths (just change the values for name in the WHERE clause). You will get five columns back for both paths, because the query there returns all five identifiers (a, b, findme, c, x). In the case of the three-node path, the first result column will be identical to the second, and the fourth to the fifth, because "node a and node b connected by a zero-depth relationship" just means "a and b are the same node".
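
A query of the kind described, returning all five identifiers for the three-node path, might look like the following sketch. The RETURN clause is an assumption based on the "five columns" description; the pattern and the names a2 and x2 come from the text above.

```cypher
MATCH (a)-[*0..]-(b)-[:RED]->(findme)<-[:RED]-(c)-[*0..]-(x)
WHERE a.name = "a2" AND x.name = "x2"
// On the three-node path, a and b are bound to the same node, as are c and x.
RETURN a, b, findme, c, x
```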

1
votes

I love when there are multiple answers that can do the job :)

Here is mine:

    MATCH p=(a {name:"a"})-[*]-(x {name:"x"})
    RETURN filter(n IN nodes(p) WHERE size((n)<-[:RED]-()) = 2) AS n
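
On newer Neo4j releases, where filter() is no longer available (it appears to have been removed in 4.0), the same idea can be expressed as a list comprehension. This is a minimal sketch, assuming the version in use still accepts size() on a pattern expression, which was itself deprecated later on:

```cypher
MATCH p = (a {name:"a"})-[*]-(x {name:"x"})
// Keep the path nodes that have exactly two incoming RED relationships.
RETURN [n IN nodes(p) WHERE size((n)<-[:RED]-()) = 2] AS n
```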

+1 to Will and Jonathan