
I am trying to import a large amount of data from CSV into Neo4j using the neo4j-rest Java API. To avoid out-of-memory exceptions, I am using periodic commit. Sample Java code looks like this:

// Classes used below
import org.neo4j.rest.graphdb.query.CypherTransaction;
import org.neo4j.rest.graphdb.query.CypherTransaction.Statement;
import org.neo4j.rest.graphdb.query.CypherTransaction.Result;
import org.neo4j.rest.graphdb.query.CypherTransaction.ResultType;

private static final String CREATE_USER =
    "USING PERIODIC COMMIT 10000 LOAD CSV WITH HEADERS FROM " +
    "\"URL\" AS line WITH line\n" +
    "CREATE (u:USER {id: toInt(line.customer_key)})";

// Create the USER nodes
Statement userStatement = new Statement(CREATE_USER, null, ResultType.rest, false);

CypherTransaction periodicCommitTransaction = new CypherTransaction(dbPath, CypherTransaction.ResultType.rest);
periodicCommitTransaction.addAll(userStatement);
periodicCommitTransaction.commit();

My question is: how should I handle transaction rollbacks with periodic commits? I know that periodic commit statements cannot be run inside an open transaction and must be committed right after the request is sent. This means there is no way to roll back if something goes wrong. I guess this is a common problem with batch insertions, so how should I handle such rollbacks? Should I drop my Neo4j database and restart the whole process from the beginning? Any thoughts?

Use a parameter for the URL. – Michael Hunger
Yes, I am using one in my real code :-) I only changed it for posting here. Thanks, Michael. – Lina

1 Answer


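Correct, PERIODIC COMMIT commits after every N rows (10000 in your statement, 1000 if no number is given), so batches that have already been committed cannot be rolled back.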

The only thing you can do is mark your "in-flight" nodes with a dedicated label such as :Importing. If the import succeeds, remove that label; if something fails, delete all nodes carrying it, together with their relationships. You have to batch the cleanup, though:

// Run repeatedly until the returned count is 0
MATCH (n:Importing)
WITH n LIMIT 10000
DETACH DELETE n
RETURN count(*);
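
To make the batching concrete, here is a minimal Java sketch of the loop around such a statement. It is not tied to the neo4j-rest API: `runBatch` is a hypothetical stand-in for whatever code sends one Cypher statement and returns the reported `count(*)`, and the class name, method name, and the `CREATE_USER_MARKED` and `UNMARK_BATCH` statements are assumptions written for illustration, not taken from the question. Only `DELETE_BATCH` is the query from the answer.

```java
import java.util.function.LongSupplier;

public class ImportCleanup {

    // Assumed variant of the import statement that also sets the marker label.
    static final String CREATE_USER_MARKED =
        "USING PERIODIC COMMIT 10000 LOAD CSV WITH HEADERS FROM {url} AS line " +
        "CREATE (u:USER:Importing {id: toInt(line.customer_key)})";

    // On success: strip the marker label, 10000 nodes at a time (assumed statement).
    static final String UNMARK_BATCH =
        "MATCH (n:Importing) WITH n LIMIT 10000 REMOVE n:Importing RETURN count(*)";

    // On failure: delete marked nodes and their relationships, 10000 at a time.
    static final String DELETE_BATCH =
        "MATCH (n:Importing) WITH n LIMIT 10000 DETACH DELETE n RETURN count(*)";

    /**
     * Calls runBatch until it reports that no more nodes were touched.
     * runBatch is expected to execute one batched statement in its own
     * transaction and return the count(*) value from that statement.
     */
    static long runUntilEmpty(LongSupplier runBatch) {
        long total = 0;
        long touched;
        do {
            touched = runBatch.getAsLong();
            total += touched;
        } while (touched > 0);
        return total;
    }

    public static void main(String[] args) {
        // In-memory fake standing in for the database: 25000 marked nodes.
        long[] remaining = {25000};
        long total = runUntilEmpty(() -> {
            long n = Math.min(10000, remaining[0]);
            remaining[0] -= n;
            return n;
        });
        System.out.println("Nodes processed: " + total);
    }
}
```

In real code the lambda would execute `DELETE_BATCH` after a failed import or `UNMARK_BATCH` after a successful one, each call in its own transaction, so that no single transaction has to hold all marked nodes in memory.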