12
votes

I have been trying to run this query as recommended in the neo4j google group and in other sources online:

START n = node(*) MATCH n-[r?]-() WHERE ID(n)>0 DELETE n, r;

in order to delete all nodes and relationships between tests. When I do so from the console, I run out of Java heap space. When I do so from Python (using the newish graph_db.clear(), which appears to use the same query), I get a "SystemError: None", which I assume is the same Java heap space error. I have a database with 500k nodes, only 5k relationships, and 7M properties. I am running on a Mac laptop (10.6.8) with 8GB RAM using neo4j-1.8.1. I guess I am a bit surprised that deleting nodes (with essentially no relationships, so very small subgraphs) would exceed the Java heap space, but I am pretty naive about how Neo4j works. Any suggestions on how to go forward are appreciated. I do know that rm -rf in the data directory and starting from scratch will work, but I thought there might be a less drastic solution.

[cross-posted to neo4j google groups]

6
Paging after a WITH is even more convenient and sensible: START n = node(*) MATCH n-[r?]-() WITH n,r LIMIT 10000 DELETE n, r; - Michael Hunger
@MichaelHunger shouldn't it be: START n = node(*) WITH n LIMIT 10000 MATCH n-[r?]-() DELETE n, r;? - joewhite86

6 Answers

19
votes

The Cypher statement above instantiates all nodes (except the root node with ID 0) before deleting them in one single transaction. With 500k nodes, this eats up too much memory.

Try limiting the number of nodes deleted per statement to something around 10k-50k, for example:

START n = node(*) 
MATCH n-[r?]-() 
WHERE (ID(n)>0 AND ID(n)<10000) 
DELETE n, r;

START n = node(*) 
MATCH n-[r?]-() 
WHERE (ID(n)>0 AND ID(n)<20000) 
DELETE n, r;

etc.
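Writing out each ID window by hand gets tedious, so here is a minimal Python sketch that generates the sequence of statements. The function name id_range_delete_statements and its parameters are hypothetical, and you have to supply the highest node ID yourself. Unlike the statements above, it uses disjoint windows (lower bound exclusive, upper bound inclusive) instead of always starting at 0, which should delete the same nodes overall.

```python
def id_range_delete_statements(max_node_id, batch_size=10000):
    """Yield one Cypher DELETE statement per window of node IDs.

    Hypothetical helper: max_node_id is the highest node ID in the store,
    batch_size is the width of each ID window.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    lower = 0  # exclusive lower bound, so the root node (ID 0) is never deleted
    while lower < max_node_id:
        upper = lower + batch_size  # inclusive upper bound of this window
        yield (
            "START n = node(*) "
            "MATCH n-[r?]-() "
            "WHERE (ID(n)>{lo} AND ID(n)<={hi}) "
            "DELETE n, r;"
        ).format(lo=lower, hi=upper)
        lower = upper


if __name__ == "__main__":
    # Print the statements for a store whose highest node ID is 25000.
    for statement in id_range_delete_statements(25000):
        print(statement)
```

Each generated statement would then be run as its own transaction, so only one window of nodes is held in memory at a time.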

However, there's nothing wrong with removing the entire database directory; it's good practice.

10
votes

According to the Neo4j documentation, the whole graph is deleted with:

MATCH (n)
OPTIONAL MATCH (n)-[r]-()
DELETE n,r;

To avoid the Java heap space error, I combined this code with LIMIT:

MATCH (n)
OPTIONAL MATCH (n)-[r]-()
WITH n,r LIMIT 100000 DELETE n,r;

Running it repeatedly reduces the number of nodes, and eventually lets you use the first, recommended and more general query.
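The "run it until the graph is small enough" step can be automated. The sketch below simply repeats the LIMIT-ed delete until no nodes remain. It assumes a hypothetical callable run(query) that executes one Cypher statement in its own transaction and returns the first value of the first result row (or None); that callable is not part of any particular driver, so you would have to wrap your own client to match it.

```python
DELETE_BATCH = (
    "MATCH (n) "
    "OPTIONAL MATCH (n)-[r]-() "
    "WITH n,r LIMIT 100000 DELETE n,r;"
)
COUNT_NODES = "MATCH (n) RETURN count(n);"


def delete_in_batches(run, max_rounds=1000):
    """Repeat the LIMIT-ed delete until the node count reaches zero.

    `run` is a hypothetical callable (an assumption, not a driver API):
    it executes one statement and returns the first value of the first row.
    Returns the number of delete rounds that were executed.
    """
    rounds = 0
    while run(COUNT_NODES) > 0:
        if rounds >= max_rounds:
            # Guard against looping forever if deletes stop making progress.
            raise RuntimeError("node count did not reach zero")
        run(DELETE_BATCH)
        rounds += 1
    return rounds
```

Checking the count between rounds costs an extra query per batch, but it keeps the delete statement identical to the one above.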

7
votes

The question mark syntax does not work anymore. Use OPTIONAL MATCH instead; the query below should work.

// Filter n first: a WHERE placed after OPTIONAL MATCH only restricts the optional pattern, not n.
START n = node(*)
WHERE (ID(n)>0 AND ID(n)<10000)
OPTIONAL MATCH n-[r]-()
DELETE n, r;

4
votes

As of Neo4j 2.3.3, there is a new way to remove nodes together with their relationships. See the 2.3.3 docs.

For example, you could do:

MATCH (n) DETACH DELETE n;

0
votes

You could increase the heap size in your Neo4j properties, enable the GC log, and watch whether heap usage really gets near the upper limit. The page cache size may need to be reduced or increased depending on its initial size; adjust it and check the effect on load time. Neo4j is memory hungry, so give it as large a heap as you can.

0
votes

I found a better solution in the Neo4j knowledge base [1]:

CALL apoc.periodic.iterate(
    "MATCH (n) RETURN n",
    "DETACH DELETE n",
    {batchSize:1000}
)
YIELD batches, total RETURN batches, total

[1] - https://neo4j.com/developer/kb/large-delete-transaction-best-practices-in-neo4j/