0
votes

I am trying to follow a video tutorial about Neo4j that uses the Movies graph database that ships with Neo4j by default.

In this tutorial there is an assignment asking to retrieve the actors who played in the most movies, ordered by count descending and limited to 5.

The tutorial's solution doesn't match my result, and there is something I don't understand: my result contains duplicate movies even though I'm using a similar Cypher query.

Tutorial solution:

MATCH (actor:Person)-[:ACTED_IN]-() RETURN actor.name,
COUNT(*) as COUNT ORDER BY COUNT DESC LIMIT 5;


In my solution I get duplicates:

MATCH (actor:Person)-[:ACTED_IN]-(movie:Movie) RETURN actor.name,
COLLECT(movie.title), COUNT(*) as COUNT ORDER BY COUNT DESC LIMIT 5;

"Meg Ryan" ["Top Gun", "You've Got Mail", "Sleepless in Seattle", "Joe Versus the Volcano", "When Harry Met Sally", "Top Gun", "You've Got Mail", "Sleepless in Seattle", "Joe Versus the Volcano", "When Harry Met Sally", "Top Gun", "You've Got Mail", "Sleepless in Seattle", "Joe Versus the Volcano", "When Harry Met Sally"]

When I use:

MATCH (actor:Person)-[:ACTED_IN]-(movie:Movie) RETURN actor.name,
COLLECT(DISTINCT movie.title), COUNT(*) as COUNT ORDER BY COUNT DESC LIMIT 5;

I get the same movies as the tutorial's solution, but the COUNT column still shows the count including the duplicate movies.


2 Answers

0
votes

I had similar issues with the tutorial: it is very easy to add the same nodes multiple times with Cypher CREATE statements. Maybe this has happened to you too?

Maybe just run:

MATCH (n) RETURN n;

and then eyeball the whole graph. The tutorial dataset should be small enough to do this, and you'll see whether you have duplicates.
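
If eyeballing the graph is awkward, a query along these lines might surface the duplicates directly. This is only a sketch: it assumes the standard Movies dataset, where every Person node has a `name` property, and the aliases `name` and `copies` are made up for illustration.

```cypher
// Sketch: list person names that appear on more than one Person node.
// Assumes each Person carries a `name` property (as in the Movies dataset).
MATCH (p:Person)
WITH p.name AS name, count(*) AS copies
WHERE copies > 1
RETURN name, copies
ORDER BY copies DESC;
```

If this returns rows, the dataset was probably loaded more than once. Since the queries in the question group by actor.name, several nodes with the same name would be aggregated into a single row, which would be consistent with the repeated titles shown for "Meg Ryan".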

0
votes

You can use DISTINCT inside COUNT:

MATCH (actor:Person)-[:ACTED_IN]-(movie:Movie) 
RETURN 
    actor.name,
    COLLECT(DISTINCT movie.title), 
    COUNT(DISTINCT movie) as COUNT 
ORDER BY COUNT DESC LIMIT 5;

P.S. In this case, COUNT(*) returns the number of matched patterns (one per row that matched), not the number of distinct movies.
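
To see the difference, the counts can be put side by side. The following is a sketch rather than a definitive fix; the aliases `matched_rows`, `distinct_nodes` and `distinct_titles` are illustrative names, not part of the tutorial.

```cypher
// Sketch: compare the number of matched patterns with distinct counts.
MATCH (actor:Person)-[:ACTED_IN]-(movie:Movie)
RETURN
    actor.name,
    COUNT(*) AS matched_rows,                     // one per matched pattern
    COUNT(DISTINCT movie) AS distinct_nodes,      // distinct Movie nodes
    COUNT(DISTINCT movie.title) AS distinct_titles // distinct title strings
ORDER BY matched_rows DESC LIMIT 5;
```

Note that COUNT(DISTINCT movie) counts distinct nodes. If the Movie nodes themselves were created more than once, each copy is a separate node, so distinct_nodes would still be inflated and only distinct_titles would match the tutorial's numbers.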