I have a simple cypher query which is taking too long to complete.
Node Types
Person {PersonId, PersonEmailAddress}
Document {DocumentId}
Relationship Types: SENT, TO, CC, BCC
The concept is:
(p:Person)-[:SENT]->(d:Document)-[:TO|CC|BCC]->(anotherPerson:Person)
I am trying to get those person to whom any given person has sent maximum mails.
For example you can see in the following diagram that Person#1 has sent maximum mails (6 mails) to Person#77 and Person#615 and so on. So here I want (as mentioned in the below table) top 5 Person ids with count of number of mail sent to that person.
+------------+-----------------------+
| ReceiverId | NumberOfMailsReceived |
+------------+-----------------------+
| 77 | 6 |
| 615 | 6 |
| 101 | 4 |
| 247 | 4 |
| 252 | 4 |
+------------+-----------------------+
I'm trying the following query:
MATCH(p:Person{PersonId:1})-->(d) WITH DISTINCT d
MATCH (d)-->(rc)
RETURN rc, COUNT(rc) as c ORDER BY c DESC LIMIT 5
Here there is no performance issue with this query as there is only 9 documents the Person#1 has sent and only 15 person are involved as Receivers of this 9 documents.
But If I fire the same query for a different Person who has sent total nearly 56,500 documents, and total 869 (distinct) person are involved as Receivers, the query takes 43261 ms to complete
Cypher version: CYPHER 3.1, planner: COST, runtime: INTERPRETED. 21570218 total db hits in 43261 ms.
The PROFILE is as below
My neo4j browser shows Size: 5.16 GiB under Database section.
This are the configs I'm using:
dbms.memory.heap.initial_size=8G
dbms.memory.heap.max_size=8G
dbms.memory.pagecache.size=4g
Any suggestions or ideas to optimize the query ?
Thanks in advance.
EDIT
After updating Neo4j version from 3.1.1 to 3.2, the query still takes approx 23368 ms.