
Say I have matched a start node and now have its id.

Next I want to find all the nodes along incoming paths to my start node that have a certain label. I don't know how "distant" nodes with that label are (but think they will be less than 20 nodes away). If there are any related nodes with that label, they will all be the same distance away.

At the moment I'm doing:

MATCH (l:`mylabel`)-[*1..20]->(start) where id(start) = 1016236 RETURN l

But I'm guessing this could be wasteful and slow.

I could do something like (in pseudocode):

for $i in (1..20) {
    @ls = run_cypher("MATCH (l:`mylabel`)-[*1..$i]->(start) where id(start) = 1016236 RETURN l")
    last if @ls
}

But that needs multiple separate queries and is still wasteful.

Is there a better solution?

Are you saying that if any path has node(s) with that label, then all paths will have node(s) with that label, and in exactly the same positions? - cybersam
I was wrong (see comments on Brian's answer) - sbs
Actually, if there are multiple depths, I only want nodes at the lowest depth - sbs
Can you share more about the model that you are querying? Unless your graph is very unusual (or unprecedentedly large) it's unlikely that you will get anything useful out of a depth-20 traversal. You are almost certainly traversing the entire graph. - Ben Butler-Cole
I'm expecting long linear chains. I have (non-literal) parent->child relationships with the possibility of many "generations", and my starting point would be an unknown generation and I want to walk the tree from some child up to a certain ancestor. Unfortunately things are more complex and there isn't always just a single path to do that. - sbs

1 Answer


If they're all going to be at the same depth, you might first run a query to find the minimum depth and then query at exactly that depth:

MATCH path=shortestPath((l:`mylabel`)-[*1..20]->(start))
WHERE id(start) = 1016236
RETURN min(length(path)) AS min_depth

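and then you could use that depth to query like this (with $i replaced by the min_depth value in your client code; as far as I know, Cypher does not accept a parameter as a variable-length bound):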

MATCH path=(l:`mylabel`)-[*$i..$i]->(start)
WHERE id(start) = 1016236
RETURN l
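
For what it's worth, the two-step approach above could be glued together on the client roughly as follows. This is only a sketch: `run_cypher` is a hypothetical callable (borrowed from the question's pseudocode) that takes a query string and a parameter dict and returns a list of row dicts, and `nodes_at_min_depth` is a name made up for illustration. The depth is spliced into the query text as an integer, while the start id is passed as a parameter. It also uses `RETURN DISTINCT l`, which is an addition to avoid duplicate rows when several paths reach the same node.

```python
from typing import Any, Callable, Dict, List

# Hypothetical runner type: (cypher_text, params) -> list of row dicts.
# How you implement it depends on your driver; this is an assumption,
# not part of any particular library's API.
Runner = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]

# Step 1: find the smallest depth at which a `mylabel` node occurs.
MIN_DEPTH_QUERY = (
    "MATCH path = shortestPath((l:`mylabel`)-[*1..20]->(start)) "
    "WHERE id(start) = $start_id "
    "RETURN min(length(path)) AS min_depth"
)


def nodes_at_min_depth(run_cypher: Runner, start_id: int) -> list:
    """Return the `mylabel` nodes at the lowest depth above the start node."""
    rows = run_cypher(MIN_DEPTH_QUERY, {"start_id": start_id})
    depth = rows[0]["min_depth"] if rows else None
    if depth is None:
        # min() over zero paths yields null: nothing with that label upstream.
        return []

    # Coerce to int so that only a number is ever spliced into the query text.
    depth = int(depth)

    # Step 2: match only paths of exactly that length.
    query = (
        f"MATCH (l:`mylabel`)-[*{depth}..{depth}]->(start) "
        "WHERE id(start) = $start_id "
        "RETURN DISTINCT l"
    )
    return [row["l"] for row in run_cypher(query, {"start_id": start_id})]
```

You would call it as `nodes_at_min_depth(run_cypher, 1016236)`. It still costs two round trips, so whether it beats the single `[*1..20]` query depends on your graph.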

I'd be curious to know if that performs better. Honestly, since Neo4j is optimized for graph traversals, it might not make a big difference, but it depends on your graph. How well is it performing now?