2
votes

I have a large graph in which some relationships have properties that I want to use to prune the graph as I create a subgraph. For example, given a property called 'relevance score', I want to start at one node and sprawl out, collecting all nodes and relationships but pruning wherever a relationship has that property.

My attempt to do so netted this query:

start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r

My attempt has two issues I cannot resolve:

1) On reflection, I believe this will not result in a pruned graph but rather in a collection of disjoint graphs. Additionally:

2) I am getting the following error from what looks to be a correctly formed cypher query:

Type mismatch: expected Any, Map, Node or Relationship but was Collection<Relationship> (line 1, column 52 (offset: 51))
"start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r"
Which version of Neo4j are you using? START is considered deprecated and HAS is no longer supported in 3.x. - Gabor Szarnyas
3.0.6. It still works for other queries, though, but good to know. I missed that point in the docs. - WildBill

2 Answers

3
votes

You should be able to use the ALL() function on the collection of relationships to enforce that for all relationships in the path, the property in question is null.

Using Gabor's sample graph, this query should work.

MATCH p = (n {name: 'n1'})-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance_score is null)
RETURN p
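
If you want the reachable nodes and the relationship lists rather than whole paths, the same idea can be phrased with NONE() and exists(). This is only a sketch under the assumption that you are running against Gabor's sample graph (nodes named 'n1' to 'n5'); the variable names rs and x are my own choice.

```cypher
// Sketch: keep a path only if none of its relationships carries
// the relevance_score property.
// Assumes the sample graph with nodes named 'n1'..'n5'.
MATCH (n {name: 'n1'})-[rs*]->(x)
WHERE NONE(rel IN rs WHERE exists(rel.relevance_score))
RETURN DISTINCT x, rs
```

On that sample graph this should return n4 and n5 only. Because every relationship on the path from the start node has to pass the check, each returned node is still connected to the start node through unpruned relationships, so the result is a single pruned subgraph rather than a set of disjoint pieces.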
2
votes

One solution that I can think of is to go through all relationships (with rs*), keep only the ones without the relevance_score property, and check whether the filtered rs "path" is still the same as the original. (I quoted "path" because technically it is not a Neo4j path.)

I created a small example graph:

CREATE
  (n1:Node {name: 'n1'}),
  (n2:Node {name: 'n2'}),
  (n3:Node {name: 'n3'}),
  (n4:Node {name: 'n4'}),
  (n5:Node {name: 'n5'}),
  (n1)-[:REL {relevance_score: 0.5}]->(n2)-[:REL]->(n3),
  (n1)-[:REL]->(n4)-[:REL]->(n5)

The graph contains a single relevant edge, between nodes n1 and n2.
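
As a quick sanity check, a query along these lines should list the relationships that carry the property. It is a minimal sketch that assumes the graph was created exactly as above; the aliases source, target and score are just illustrative names.

```cypher
// List every REL relationship that has a relevance_score property.
MATCH (a)-[r:REL]->(b)
WHERE exists(r.relevance_score)
RETURN a.name AS source, b.name AS target, r.relevance_score AS score
```

For the graph above this should give a single row, for the relationship from n1 to n2 with score 0.5.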


The query (note that I used {name: 'n1'} to get the start node, you might use START node=...):

MATCH (n {name: 'n1'})-[rs1*]->(x)
UNWIND rs1 AS r
WITH n, rs1, x, r
WHERE NOT exists(r.relevance_score)
WITH n, rs1, x, collect(r) AS rs2
WHERE rs1 = rs2
RETURN n, x

The results:

╒══════════╤══════════╕
│n         │x         │
╞══════════╪══════════╡
│{name: n1}│{name: n4}│
├──────────┼──────────┤
│{name: n1}│{name: n5}│
└──────────┴──────────┘

Update: see InverseFalcon's answer for a simpler solution.