We have a graph consisting of data set nodes (~500) and user nodes (~15). When I tried to delete both sets of nodes using the following query, the memory usage of Neo4j (v2.3.1) went up to over 1.5 GB and the query was quite slow:
MATCH (ds:DataSet), (u:User)
OPTIONAL MATCH (ds)-[r1]-(), (u)-[r2]-()
DELETE ds, u, r1, r2
Surprisingly, splitting the query into the following two queries:
MATCH (ds:DataSet) OPTIONAL MATCH (ds)-[r]-() DELETE ds, r
MATCH (u:User) OPTIONAL MATCH (u)-[r]-() DELETE u, r
kept the memory usage at ~240 MB. For comparison, the memory consumption right after startup is ~230 MB.
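To compare how much work each variant does, the same patterns can be run with a row count in place of the DELETE. This is only a diagnostic sketch that I have not run against our data; the alias `rows` is just an illustrative name, and each statement is meant to be executed separately.

```cypher
// Rows produced by the combined pattern (nothing is deleted)
MATCH (ds:DataSet), (u:User)
OPTIONAL MATCH (ds)-[r1]-(), (u)-[r2]-()
RETURN count(*) AS rows

// Rows produced by the DataSet-only pattern
MATCH (ds:DataSet) OPTIONAL MATCH (ds)-[r]-() RETURN count(*) AS rows

// Rows produced by the User-only pattern
MATCH (u:User) OPTIONAL MATCH (u)-[r]-() RETURN count(*) AS rows
```

If the first count is far larger than the sum of the other two, that would suggest the combined query builds every (ds, u) combination before deleting anything.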
My question is whether there is a conceptual issue with the first Cypher query. Is it supposed to be this inefficient to delete multiple sets of nodes in a single query?
Note:
The two node sets (users and data sets) do not overlap but are linked together, i.e. a user node can be connected to a data set node via relationships.