Dear Cypher/Neo4j experts,
I have the following part of a graph where I have Countries, Cities, Companies and Projects modelled. I have the following Cypher query where I get the Country nodes, as well as the unique identifiers of the contained city nodes, a count of the LOCATED_IN and a count of the MAY_BE_EXECUTED_IN relationships
MATCH (country:Country)
OPTIONAL MATCH (country)-[:CONTAINS]->(city:City)
OPTIONAL MATCH (:Project)-[may_be_executed_in:MAY_BE_EXECUTED_IN]->(country)
OPTIONAL MATCH (:Company)-[located_in:LOCATED_IN]->(country)
RETURN DISTINCT country {.*
, city: COLLECT(DISTINCT city.identifier)
, MAY_BE_EXECUTED_IN_Count: COUNT(DISTINCT may_be_executed_in)
, LOCATED_IN_Count: COUNT(DISTINCT located_in)
}
ORDER BY country.identifier ASCENDING
There are 76 Country nodes and about 10000 Company nodes in the database. When I run the query it takes about 30 to 40 seconds, when I leave out the MAY_BE_EXECUTED_IN_Count: COUNT(DISTINCT may_be_executed_in)
part the query completes in approx 40 milliseconds. I realize that the resources allocated to neo4j play a role, but my guess is that my query is not good. Any help would be appreciated.