
I'm using Neo4j version 3.0.7

I'm reading a list of edges from a dataset and need to send those edges to Neo4j in batches through the REST API. I use the following query format to create multiple nodes (if they don't already exist) and their relationships in a single Cypher query. For each edge I take its two vertices, and the name property of each node is set to the ID of the corresponding vertex.

{
  "query":
    "MATCH (n { name: 0 }), (m { name:1 })
    CREATE (n)-[:X]->(m)
    WITH count(*) as dummy
    MATCH (n { name: 0 }), (m { name: 6309 })
    CREATE (n)-[:X]->(m)"
}

This approach works correctly for a batch of 10 edges, but when I try to send a batch of 1000 edges (nodes and their relationships) in a single Cypher query, I get a StackOverflowError. Is there a better approach to achieve this task? Thank you for your help.

The error obtained from the response:

{
  "exception" : "StackOverflowError",
  "fullname" : "java.lang.StackOverflowError",
  "stackTrace" : [ "scala.collection.TraversableOnce$class.$div$colon(TraversableOnce.scala:151) ..."
}
If you use this approach to insert ~1000 nodes, you will get a huge Cypher query. The StackOverflowError might be a symptom of the parser failing on that query. - Gabor Szarnyas
Did my answer solve the problem? If so, can you please mark the answer as accepted? - Gabor Szarnyas
I tried the approach you mentioned, but that way it takes about 3 minutes to send a single edge, so sending a batch of 1000 edges takes hours. Is there a faster way to send batches of edges? - sathya
3 minutes is quite a lot for inserting a single edge. Do you have lots of nodes (like millions) in your database? Anyway, you are using the name attribute to find the nodes, so it would be worth putting an index on this property. For example, if you have Person nodes, index them with CREATE INDEX ON :Person(name) (this example is taken from the Cypher reference card). It should make quite a difference in the performance of the query. - Gabor Szarnyas
Thank you very much :) I was able to upload a batch of edges within seconds by creating an index on the nodes. - sathya

1 Answer


You can use UNWIND to get a single query:

UNWIND [[0,1], [0,6309]] AS pair
MATCH (n {name: pair[0]}), (m {name: pair[1]})
CREATE (n)-[:X]->(m)

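Insert your node pairs after UNWIND as a list of two-element lists. As the query uses the name property to find the nodes, it is worth adding an index on it. Note that an index is defined per label, so it is only used when the MATCH pattern names that label, e.g. (n:Person {name: pair[0]}). For example, if you have Person nodes, index them with: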

CREATE INDEX ON :Person(name)

(See also the Cypher reference card.)
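
To keep the query text small no matter how many edges are sent, the pair list can be passed as a query parameter instead of being written into the Cypher string. The following is a minimal Python sketch of building such request bodies in batches. It assumes the request goes to an endpoint that accepts a "query" field (as in the question) together with a "params" field, and that the server understands the {pairs} parameter syntax. The names QUERY and build_payloads are hypothetical helpers, not part of any Neo4j library, and the sketch only builds the JSON bodies without sending them.

```python
import json

# Fixed query text; the edge list is supplied through the "pairs" parameter.
# Assumption: the server accepts the {pairs} parameter syntax.
QUERY = (
    "UNWIND {pairs} AS pair "
    "MATCH (n {name: pair[0]}), (m {name: pair[1]}) "
    "CREATE (n)-[:X]->(m)"
)


def build_payloads(edges, batch_size=1000):
    """Split (source, target) pairs into JSON request bodies, one per batch."""
    payloads = []
    for start in range(0, len(edges), batch_size):
        # Convert each pair to a two-element list, as UNWIND expects.
        batch = [[src, dst] for src, dst in edges[start:start + batch_size]]
        # Assumption: the endpoint reads parameters from a "params" field.
        payloads.append(json.dumps({"query": QUERY, "params": {"pairs": batch}}))
    return payloads


if __name__ == "__main__":
    for body in build_payloads([(0, 1), (0, 6309)], batch_size=1000):
        print(body)
```

Each returned string can be posted as the body of one request, so the query itself stays three lines long whether a batch holds 10 edges or 1000.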