2
votes

This (newbie) question is based on the database provided in the official "Getting started with Neo4j" online tutorial.

My goal is to create a query that lists all people who are connected to more than two movies. The database consists of nodes of type Person and Movie. Nodes are connected by relationships such as 'ACTED_IN', 'DIRECTED', 'WROTE' or 'PRODUCED'.

The DB states that Gene Hackman acted in three different movies while Cameron Crowe has directed, produced and written a single movie. So Cameron has three relationships to a single movie while Gene is connected to three different movies. More nodes and connections exist.

My current query is as follows:

match (p:Person)-[r]->(m:Movie) with p, count(r) as rel where rel > 2 return p;

This will return:

  • Gene Hackman
  • Tom Cruise
  • Cameron Crowe

Gene and Tom have each acted in three different movies, so this is correct. As stated above, all of Cameron's relationships point to the same movie, so he should not be in this list. The query should return only the first two people.

Clearly, I could just restrict the match to the 'ACTED_IN' relationship, but I'd also like to list people who are not actors, e.g. an author who wrote three or more separate movies.

Another query I thought about is:

match (m:Movie)<-[ra]-(p:Person)-[rb]->(b:Movie) ...

Unfortunately this limits the number of connections to exactly two.

Is it possible to adapt the first query in a way that only different movies are included when counting the relationships between people and movies?

2 Answers

0
votes

Here's a query that can do what you are looking for.

MATCH (p:Person)--(m:Movie)
WITH p, collect(DISTINCT m) AS ms
WHERE size(ms) > 2
RETURN p

You don't really care about the relationship type, so don't worry about it. You collect the distinct movies for each person and return the person if the number of distinct movies is greater than your threshold.
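If you only need the number of movies and not the movies themselves, the same idea can be sketched with `count(DISTINCT m)` instead of collecting a list. This variant assumes that Person nodes carry a `name` property, as the names in the question suggest; the alias `movies` is just an illustrative name.

```cypher
// Count each movie once per person, regardless of how many
// relationships (ACTED_IN, DIRECTED, ...) connect the two.
MATCH (p:Person)-->(m:Movie)
WITH p, count(DISTINCT m) AS movies
WHERE movies > 2
// p.name is an assumption about the tutorial data model
RETURN p.name AS name, movies
ORDER BY movies DESC
```

Counting directly avoids building a list per person, which is usually the lighter option when the list is not needed afterwards.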

Grace and peace,

Jim

1
votes

Try this query:

match (p:Person)-[r]->(m:Movie)
with p, count(distinct endNode(r)) as q
where q > 2 return p;

What it does is take the end node of each relationship, i.e. the movie, and count the distinct ones per person. Requiring more than two distinct movies gets rid of the Cameron Crowe problem, because all three of his relationships end at the same movie.
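To see why the relationship count and the movie count differ, a diagnostic sketch like the following lists both side by side for everyone with more than two relationships. It assumes Person nodes have a `name` property; the aliases `rels` and `movies` are illustrative.

```cypher
// For each person, compare the number of relationships with the
// number of distinct movies those relationships point to.
MATCH (p:Person)-[r]->(m:Movie)
WITH p, count(r) AS rels, count(DISTINCT m) AS movies
WHERE rels > 2
// p.name is an assumption about the tutorial data model
RETURN p.name AS name, rels, movies
```

Based on the data described in the question, Cameron Crowe should appear with three relationships but only one movie, while Gene Hackman should show three of each.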