Multiple OPTIONAL MATCH with COUNT doesn't seem to work

Question

I have an Item node, which is POSTED_BY a User node. Additionally, the Item may be LIKED by zero or more User nodes, and zero or more Comments may be POSTED_IN the Item node.

What I need to do is to query for a specific Item node (given the id), and retrieve:

The Item node itself
The User node that the Item was POSTED_BY
The count of the number of Users who LIKE the Item node
The count of the number of Comments POSTED_IN the Item node

If I try to query just 1-3, and exclude 4 for now, my query looks something like this:

MATCH (i:Item {id: 'ByvIzUdbZ'})-[:POSTED_BY]->(u:User) 
OPTIONAL MATCH (u2:User)-[:LIKES]->(i)  
RETURN i, u, COUNT(u2);

This returns the Item node, the User and the LIKES count correctly.

Now, to accommodate the 4th point, I modified the query to add another OPTIONAL MATCH, as below:

MATCH (i:Item {id: 'ByvIzUdbZ'})-[:POSTED_BY]->(u:User) 
OPTIONAL MATCH (u2:User)-[:LIKES]->(i) 
OPTIONAL MATCH (comment:Comment)-[:POSTED_IN]->(i) 
RETURN i, u, COUNT(u2), COUNT(comment);

With this, however, the counts for both User (u2) and Comment (comment) are incorrect. They seem to be getting "added up". So, if the Likes was actually 3, the likes and comment count are both returned as 6. Any tips on what I'm doing wrong here? Thanks.

cybersam cybersam · Accepted Answer · 2017-06-04T17:05:08

You are seeing the effect of cartesian products.

If the number of rows that would be generated by a MATCH clause on its own would be N, but the number of rows that already exist (say, from previous MATCH clauses) is M, then you end up with M*N rows.

To avoid this multiplicative effect (which also adversely hits performance and memory), you can perform aggregation immediately after the appropriate clauses. For example, in your case:

MATCH (i:Item {id: 'ByvIzUdbZ'})-[:POSTED_BY]->(u:User) 
OPTIONAL MATCH (u2:User)-[:LIKES]->(i)
WITH i, u, COUNT(u2) AS u2_count
OPTIONAL MATCH (comment:Comment)-[:POSTED_IN]->(i) 
RETURN i, u, u2_count, COUNT(comment) AS comment_count;

Multiple OPTIONAL MATCH with COUNT doesn't seem to work

2 Answers