Optimizing a Cypher query to improve performance

Question

The query I've written returns accurate results based on some random testing I've done. However, the query execution takes really long (7699.43 s) I need help optimising this query.

count(Person) -> 67895

count(has_POA) -> 355479

count(POADocument) -> 40

count(issued_by) -> 40

count(Company) -> 21

count(PostCode) -> 9845

count(Town) -> 1673

count(in_town) -> 9845

count(offers_services_in) -> 17107

All the entity nodes are indexed on Id's (not Neo4j IDs). The PostCode nodes are also indexed on PostCode.

MATCH pa= (p:Person)-[r:has_POA]->(d:POADocument)-[:issued_by]->(c:Company),
(pc:PostCode),(t:Town) WHERE r.recipient_postcode=pc.PostCode AND (pc)-
[:in_town]->(t) AND NOT (c)-[:offers_services_in]->(t) RETURN p as Person,r 
as hasPOA,t as Town, d as POA,c as Company

Much thanks in advance! -Nancy

Bruno Peres Bruno Peres · Accepted Answer · 2017-09-18T16:21:29

I made some changes in your query:

MATCH (p:Person)-[r:has_POA {recipient_code : {code} }]->(d:POADocument)-[:issued_by]->(c:Company),
    (pc:PostCode {PostCode : {PostCode} })-[:in_town]->(t:Town)
WHERE NOT (c)-[:offers_services_in]->(t)
    RETURN p as Person, r as hasPOA, t as Town, d as POA, c as Company

Since you are not using the entire path, removed pa variable
Moved the pattern existence check ((pc)-[:in_town]->(t)) from WHERE to MATCH.
Using parameters instead of the equality check r.recipient_postcode = pc.PostCode in where. If you are running the query in Neo4j Browser, you can set the parameters running the command :params {code : 10}.

Optimizing a Cypher query to improve performance

2 Answers