1
votes

I am working on a query on the movie database to check the shortest paths between n nodes. In this simplified example, we want all the shortest paths between 2 movies:

match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121 return p;

all

Now I want to have all the shortest paths that don't include Keanu Reeves in it. I tried this:

match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121 and NONE(n in nodes(p) where n.name = "Keanu Reeves") return p;

This however takes an eternity to load, even after I have indexed the name field of Person...

Then In tried the following:

match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121 with p WHERE NONE(n in nodes(p) where n.name = "Keanu Reeves") return p;

This however gives me no results. I misinterpreted this by thinking it would just return those paths which don't have Keanu Reeves in between:

(no changes, no records)

I tested if I could get only those with Keanu Reeves in between with the any() function. This works perfectly:

match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121 with p WHERE ANY(n in nodes(p) where n.name = "Keanu Reeves") return p;

any()

What is the best approach to tackle this? I must say that my production query is way more complex than this, but it all boils down to this problem. It has to be a performant solution.

1

1 Answers

2
votes

The problem is that if one of the nodes from the path does not have the property name, then the entire check will not be passed.

So or do we check for the existence of a property:

match p=allShortestPaths((n)-[*]-(m)) 
where 
    n.title = 'The Replacements' and 
    m.title = 'Speed Racer' and
    NONE(n in nodes(p) where EXISTS(n.name) and n.name = "Keanu Reeves")
return p

Or use the COALESCE function:

match p=allShortestPaths((n)-[*]-(m)) 
where 
    n.title = 'The Replacements' and 
    m.title = 'Speed Racer' and
    NONE(n in nodes(p) where COALESCE(n.name, '') = "Keanu Reeves")
return p