1
votes

I am using the "Movies" database in Neo4j to simplify my question (type :play movies in the query box of an empty sandbox). For a list of 3 actors that I specify, I want to determine the total number of movies they've worked on, the number of movies they've acted in, and the number of movies they've directed (if any). Here is what I came up with:

MATCH (p:Person)-->(m:Movie)
WITH p, m, count(m) AS total
MATCH (p)-[:ACTED_IN]->(m)
WITH p, m, total, count(DISTINCT m) AS actedIn
MATCH (p)-[:DIRECTED]->(m)
WITH p, m, total, actedIn, count(DISTINCT m) AS directed
UNWIND ["Tom Hanks", "Clint Eastwood", "Charlize Theron"] AS actors
RETURN DISTINCT actors, total, actedIn, directed

Currently, it is retuning that each actor acted in 1 movie and directed 1 movie, which is incorrect. I need to keep the WITH clauses in the query and I need to define the list of actors.

In the real query I am working on that compares to this simpler one, the same thing is happening where each element of the list I defined returns the same numbers as the other elements in the list. I am not sure what I am doing wrong here.

1
Dave Bennet's suggested query is good, that should get you the counts you need. As for an explanation of why you were getting counts of 1, aggregation in Cypher is implicit, and the non-aggregation variables make up the grouping key, which is the context for what you are aggregating over. Since m is in scope, when you do WITH p, m, count(m), that means for the individual p and m nodes on that row (a single m node), it will get the count, which is always 1. If you remove m from the WITH, and use WITH p, count(m) as total, then the count is per the single person node on that row. - InverseFalcon

1 Answers

1
votes

I think this query will work for you.

Since every person has been involved in a movie in some capacity the first MATCH can asser that and then the subsequent ones can be optional.

// Find the people that worked in total movies controlled by your list
MATCH (p:Person)-->(m:Movie)
WHERE p.name IN ["Tom Hanks", "Clint Eastwood", "Charlize Theron"]

// carry the people and the total movies per person
WITH p, count(m) AS total

// find the movies those people acted in
OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)

// carry the people, total movies and the movies acted in
WITH p, total, count(m) AS actedIn

// find the movies they directed
OPTIONAL MATCH (p)-[:DIRECTED]->(m:Movie)
RETURN p.name, total, actedIn, count(m) AS directed