2
votes

I have a graph database that contains highly connected nodes (hubs). These nodes can have more than 40000 relationships.

When I traverse the graph starting from a node, I would like to stop the traversal at these hubs so that I don't retrieve too many nodes.

I think I should use an aggregation function and a conditional stop based on each node's relationship count, but I haven't managed to write a working Cypher query.

I tried:

MATCH p=(n)-[r*..10]-(m)
WHERE n.name='MyNodeName' AND ALL (x IN nodes(p) WHERE count(x) < 10)
RETURN p;

and also:

MATCH (n)-[r*..10]-(m) WHERE n.name='MyNodeName' AND COUNT(r) < 10 RETURN p;

2 Answers

2
votes

I don't think you can stop the query at a particular node when you MATCH a path of up to 10 hops. You could count the relationships of every node in the path, but only after the path has been matched.

You could solve this by adding an extra label to the hub nodes and filtering on that label in your query:

MATCH (a:YourLabel)
OPTIONAL MATCH (a)-[r]-()
WITH a, count(r) AS count_rels
WHERE count_rels > 20000
SET a:Hub

Your query then becomes:

MATCH p=(n)-[r*..10]-(m)
WHERE n.name='MyNodeName' AND NONE (x IN nodes(p) WHERE x:Hub)
RETURN p

I used this approach in a similar case.
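
Note that the query above rejects every path that contains a hub, including paths that merely end at one. If hubs should be reachable but never traversed through, one possible variant checks only the interior nodes of the path. This is a sketch, assuming the :Hub label set above and that the start node does not need to be checked:

```cypher
// Sketch: a path may end at a hub, but must not pass through one.
// nodes(p)[1..-1] is the list of interior nodes (start and end excluded).
MATCH p=(n)-[*..10]-(m)
WHERE n.name='MyNodeName'
  AND NONE (x IN nodes(p)[1..-1] WHERE x:Hub)
RETURN p
```

For a one-hop path the interior list is empty, so the NONE predicate holds and the direct neighbours of n are always returned.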

1
votes

Since Neo4j 2.2 there is a neat trick: size((n)--()) uses the internal getDegree() function, which makes it cheap to determine whether a node is a dense node.

You also forgot the label (and probably an index) for n.

For your case that would mean:

MATCH p=(n:Label)-[r*..10]-(m)
WHERE n.name='MyNodeName' AND size((m)--()) < 10
RETURN p;
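
One caveat: the predicate above tests only the end node m, so a path can still run through a dense node on its way to a sparse one. A possible variant applies the degree check to every interior node instead. It is only a sketch, assuming the threshold of 10 from the question and a Neo4j version that still accepts size() on a pattern (Neo4j 5 replaced that form with a COUNT { ... } subquery):

```cypher
// Sketch: every node strictly between n and m must have fewer than
// 10 relationships, so paths stop at dense nodes instead of crossing them.
MATCH p=(n:Label)-[*..10]-(m)
WHERE n.name='MyNodeName'
  AND ALL (x IN nodes(p)[1..-1] WHERE size((x)--()) < 10)
RETURN p;
```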