Say I have multiple trees:

A<-{D, E}<-F
B<-{E, G}
C<-{E, H}
// Only A, B, and C are (:parent{name:""}) nodes
// The rest are child nodes

Given a set of child nodes:

{E, F} //(:child{name:""})
//Clearly A is the most connected parent even if F is not directly connected to A

Question: How can I find the most connected parent node, given a collection of child nodes? Any Cypher query, plugin function, or procedure is welcome.

Here's what I have tried, with no luck, because it counts the total number of relationships between two nodes:

MATCH (c:child)--(p:parent)
WHERE c.name IN ['E', 'F']
RETURN p ORDER BY size( (p)--(c) ) DESC LIMIT 1
// Also tried size( (p)--() ), but that counts every relationship the parent node has.

2 Answers

2
votes

The concept you're missing is variable-length relationship patterns. With these, you can match from the :child nodes you care about to :parent nodes at any distance, then count how many times each parent node occurs and take the top one:

MATCH (c:child)-[*]->(p:parent) // assumes only incoming rels toward :parent
WHERE c.name IN ['E', 'F'] // make sure you have an index on :child(name)
WITH p, count(p) AS connected
RETURN p
ORDER BY connected DESC
LIMIT 1
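
To make the counting step concrete, here is a minimal plain-Python sketch of what a variable-length match followed by a per-parent count computes on the example from the question. It is a model of the idea, not Neo4j code. The `EDGES` map is an assumption read off the ambiguous `A<-{D, E}<-F` notation (F is taken to point at both D and E), and `count_paths_to_parents` is a hypothetical helper name.

```python
from collections import Counter

# Assumed child -> next-node edges, read off the example in the question.
# The original notation is ambiguous; here F is taken to point at both D and E.
EDGES = {
    "D": ["A"],
    "E": ["A", "B", "C"],
    "F": ["D", "E"],
    "G": ["B"],
    "H": ["C"],
}
PARENTS = {"A", "B", "C"}


def count_paths_to_parents(start_children):
    """Count every directed path from a start child to a parent node."""
    counts = Counter()

    def walk(node):
        for nxt in EDGES.get(node, []):
            if nxt in PARENTS:
                counts[nxt] += 1  # one matched path ends at this parent
            walk(nxt)  # keep following outgoing edges (assumes no cycles)

    for child in start_children:
        walk(child)
    return counts


counts = count_paths_to_parents(["E", "F"])
best_parent, best_count = counts.most_common(1)[0]
print(best_parent, best_count)
```

Note that this counts paths, not children: under the assumed edges, F reaches A by two routes (through D and through E), which is what puts A ahead. If each child should count at most once per parent, `count(DISTINCT c)` is the Cypher aggregation to use instead. With these assumed edges that would give A, B, and C two each, so which count is wanted depends on what "most connected" should mean.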
0
votes

Alright, so I tried something else, but I'm not sure whether it is efficient on a huge graph (say, 2M+ nodes):

MATCH path = shortestPath( (c:child)--(p:parent) )
WHERE c.name IN [...]
WITH p, collect(path) AS cnt
RETURN p, size(cnt) AS nchild
ORDER BY nchild DESC LIMIT 1

Any opinion on this?