Initial Situation
- Large Neo4j 3.4.6 graph with a tree-like structure (10 levels deep, 10 million nodes).
- Unexceptional all nodes are connected with each other. The nodes as well as the relationships are in each case of the same type.
- Exactly one central root node.
- Reduced and simplified example:
CREATE (Root:CustomType {name: 'Root'})
CREATE (NodeA:CustomType {name: 'NodeA'})
CREATE (NodeB:CustomType {name: 'NodeB'})
CREATE (NodeC:CustomType {name: 'NodeC'})
CREATE (NodeD:CustomType {name: 'NodeD'})
CREATE (NodeE:CustomType {name: 'NodeE'})
CREATE (NodeF:CustomType {name: 'NodeF'})
CREATE (NodeG:CustomType {name: 'NodeG'})
CREATE (NodeH:CustomType {name: 'NodeH'})
CREATE (NodeI:CustomType {name: 'NodeI'})
CREATE (NodeJ:CustomType {name: 'NodeJ'})
CREATE (NodeK:CustomType {name: 'NodeK'})
CREATE (NodeL:CustomType {name: 'NodeL'})
CREATE (NodeM:CustomType {name: 'NodeM'})
CREATE (NodeN:CustomType {name: 'NodeN'})
CREATE (NodeO:CustomType {name: 'NodeO'})
CREATE (NodeP:CustomType {name: 'NodeP'})
CREATE (NodeQ:CustomType {name: 'NodeQ'})
CREATE
(Root)-[:CONTAINS]->(NodeA),
(Root)-[:CONTAINS]->(NodeB),
(Root)-[:CONTAINS]->(NodeC),
(NodeA)-[:CONTAINS]->(NodeD),
(NodeA)-[:CONTAINS]->(NodeE),
(NodeA)-[:CONTAINS]->(NodeF),
(NodeE)-[:CONTAINS]->(NodeG),
(NodeE)-[:CONTAINS]->(NodeH),
(NodeF)-[:CONTAINS]->(NodeI),
(NodeF)-[:CONTAINS]->(NodeJ),
(NodeF)-[:CONTAINS]->(NodeK),
(NodeI)-[:CONTAINS]->(NodeL),
(NodeI)-[:CONTAINS]->(NodeM),
(NodeJ)-[:CONTAINS]->(NodeN),
(NodeK)-[:CONTAINS]->(NodeO),
(NodeK)-[:CONTAINS]->(NodeP),
(NodeM)-[:CONTAINS]->(NodeQ);
To be solved challenge
- By means of a MATCH-WITH-UNWIND Cypher query I’m successfully able to select a subtree and bind it to a path. Let’s say the subtree spans over the nodes A,E,F,I and J.
- Based on this path I need all leaves of the subtree, not the complete tree now.
.
MATCH
path = (:CustomType {name:'NodeA'})-[:CONTAINS*]->(:CustomType {name:'NodeJ'}) /* simplified */
WITH
nodes(path) as selectedPath
/* here: necessary magic to identify the leaf nodes of the subtree */
RETURN
leafNode;
- Among other things I tried to solve the requirement with a
WHERE NOT(node-->())
approach, but realized this works for leaves of the complete tree only. Unfortunately I was not able to convince theWHERE NOT(node-->())
clause to respect the selected subtree boundaries. - So, how can I find all leaves of a selected subgraph with Cypher and Neo4j? Can you please give me an advice how to solve this challenge? Many thanks in advance for pointing me into the right direction!