How to return all movies with the same crew in neo4j?

Question

I am new to Neo4j and am currently experimenting with the Neo4j movie sample database.

I was wondering what the best way is to compare subgraphs and relationships, for example, how to get all movies with an identical crew.

Based on other questions here on Stack Overflow, I managed to return all movies in which specific actors acted together:

WITH ['Tom Hanks', 'Meg Ryan'] as names
MATCH (p:Person)
WHERE p.name in names
WITH collect(p) as persons
WITH head(persons) as head, tail(persons) as persons
MATCH (head)-[:ACTED_IN]->(m:Movie)
WHERE ALL(p in persons WHERE (p)-[:ACTED_IN]->(m))
RETURN m.title

But how could I retrieve movies with identical actors without specifying the actors' names?

InverseFalcon · Accepted Answer · 2017-12-05T17:26:45

Some alternate approaches that may be more efficient (check using PROFILE):

Match from movies to actors only once, collect the results, then UNWIND the collection twice to generate a cross product, and finally filter and compare the pairs. This saves you from hitting the database multiple times, since all the data you need comes from the first match. I'm going to borrow Bruno's query and tweak it a bit.

// match each movie and all its actors
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// collect this data into a single collection
with collect({m:m1, actors:actors1}) as data
// generate cross product of the data
unwind data as d1
unwind data as d2
with d1, d2
// prevent comparison against the same movie, or the same pairs in different orders
where id(d1.m) < id(d2.m) and d1.actors = d2.actors
// return movies that have the same actors
return d1.m, d2.m
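
To make the logic of the query above concrete, here is a minimal in-memory sketch in Python rather than Cypher. It is not part of the original answer: the `casts` dictionary and the `same_cast_pairs` helper are hypothetical stand-ins for the graph data, and `combinations` takes the place of the `id(d1.m) < id(d2.m)` filter.

```python
from itertools import combinations

# Hypothetical sample data: movie title -> actor names (not the real movie graph).
casts = {
    "Movie A": ["Meg Ryan", "Tom Hanks"],
    "Movie B": ["Tom Hanks", "Meg Ryan"],
    "Movie C": ["Tom Hanks"],
}


def same_cast_pairs(casts):
    """Return each unordered pair of movies whose sorted casts are equal."""
    # Sort every cast once, mirroring `order by a1.name` followed by collect().
    data = [(title, sorted(actors)) for title, actors in sorted(casts.items())]
    # combinations() yields every pair exactly once and never pairs a movie with itself.
    return [(t1, t2) for (t1, c1), (t2, c2) in combinations(data, 2) if c1 == c2]


pairs = same_cast_pairs(casts)
print(pairs)  # [('Movie A', 'Movie B')]
```

Note that the number of pairs compared grows quadratically with the number of movies, which is one reason to check the Cypher version with PROFILE.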

Alternatively, you can group movies by their ordered actor lists and return only the groups that contain more than one movie:

// match each movie and all its actors
match (m1:Movie)<-[:ACTED_IN]-(a1:Person)
// order actors by name
with m1, a1 order by a1.name
// store ordered actors into actors1 variable
with m1, collect(a1) as actors1
// group movies with their sets of actors
with collect(m1) as movies, actors1
// only interested in where multiple movies have the same actor sets
where size(movies) > 1
// return the collection of movies with the same actors
return movies

The second query is likely better here, as each row holds all movies that share the same cast, rather than one pair of movies per row.
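
The difference in result shape can be illustrated with a small in-memory Python sketch of the grouping idea. This is only a model of what the second query does, not Cypher and not part of the original answer: the `casts` data and the `group_by_cast` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: movie title -> actor names (not the real movie graph).
casts = {
    "Movie A": ["Meg Ryan", "Tom Hanks"],
    "Movie B": ["Tom Hanks", "Meg Ryan"],
    "Movie C": ["Tom Hanks"],
    "Movie D": ["Meg Ryan", "Tom Hanks"],
}


def group_by_cast(casts):
    """Group movie titles by their sorted cast and keep groups of two or more."""
    groups = defaultdict(list)
    for title, actors in casts.items():
        # The sorted tuple of names is the grouping key, like the ordered actors1 list.
        groups[tuple(sorted(actors))].append(title)
    # Equivalent of `where size(movies) > 1`.
    return [sorted(titles) for titles in groups.values() if len(titles) > 1]


print(group_by_cast(casts))  # [['Movie A', 'Movie B', 'Movie D']]
```

Three movies with the same cast come back as one group here, whereas a pairwise comparison would report them as three separate pairs.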
