
I am using Spring Data Neo4j 4.0.0 with Neo4j 2.2.1, and I am trying to import a timetree-like object with two levels under the root. The object graph is built first and saved at the end, and at some point during the saving process I get this StackOverflowError:

Exception in thread "main" java.lang.StackOverflowError
    at java.lang.Character.codePointAt(Character.java:4668)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3693)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4500)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4500)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4500)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4466)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
    at java.util.regex.Pattern$Curly.match0(Pattern.java:4177)
    at java.util.regex.Pattern$Curly.match(Pattern.java:4132)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4502)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4500)
    at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3715)
    at java.util.regex.Pattern$Start.match(Pattern.java:3408)
    at java.util.regex.Matcher.search(Matcher.java:1199)
    at java.util.regex.Matcher.find(Matcher.java:618)
    at java.util.Formatter.parse(Formatter.java:2517)
    at java.util.Formatter.format(Formatter.java:2469)
    at java.util.Formatter.format(Formatter.java:2423)
    at java.lang.String.format(String.java:2792)
    at org.neo4j.ogm.cypher.compiler.IdentifierManager.nextIdentifier(IdentifierManager.java:48)
    at org.neo4j.ogm.cypher.compiler.SingleStatementCypherCompiler.newRelationship(SingleStatementCypherCompiler.java:71)
    at org.neo4j.ogm.mapper.EntityGraphMapper.getRelationshipBuilder(EntityGraphMapper.java:357)
    at org.neo4j.ogm.mapper.EntityGraphMapper.link(EntityGraphMapper.java:315)
    at org.neo4j.ogm.mapper.EntityGraphMapper.mapEntityReferences(EntityGraphMapper.java:262)
    at org.neo4j.ogm.mapper.EntityGraphMapper.mapEntity(EntityGraphMapper.java:154)
    at org.neo4j.ogm.mapper.EntityGraphMapper.mapRelatedEntity(EntityGraphMapper.java:524)
    at org.neo4j.ogm.mapper.EntityGraphMapper.link(EntityGraphMapper.java:324)
    at org.neo4j.ogm.mapper.EntityGraphMapper.mapEntityReferences(EntityGraphMapper.java:262)
    at org.neo4j.ogm.mapper.EntityGraphMapper.mapEntity(EntityGraphMapper.java:154)
    at org.neo4j.ogm.mapper.EntityGraphMapper.mapRelatedEntity(EntityGraphMapper.java:524)
...
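For reference, here is a minimal sketch of the kind of timetree-like model I mean (class and relationship names are simplified placeholders, not my exact code, and each class lives in its own file). Each child keeps a back-reference to its parent, so the object graph is cyclic:

    import java.util.HashSet;
    import java.util.Set;

    import org.neo4j.ogm.annotation.GraphId;
    import org.neo4j.ogm.annotation.NodeEntity;
    import org.neo4j.ogm.annotation.Relationship;

    @NodeEntity
    class Root {
        @GraphId
        private Long id;

        // first level under the root, e.g. years
        @Relationship(type = "CHILD", direction = Relationship.OUTGOING)
        private Set<Level1> children = new HashSet<>();
    }

    @NodeEntity
    class Level1 {
        @GraphId
        private Long id;

        // back-reference to the root makes the graph cyclic
        @Relationship(type = "CHILD", direction = Relationship.INCOMING)
        private Root parent;

        // second level, e.g. months
        @Relationship(type = "CHILD", direction = Relationship.OUTGOING)
        private Set<Level2> children = new HashSet<>();
    }

    @NodeEntity
    class Level2 {
        @GraphId
        private Long id;

        @Relationship(type = "CHILD", direction = Relationship.INCOMING)
        private Level1 parent;
    }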

Thank you in advance; any suggestions would be really appreciated!


2 Answers

2 votes

SDN 4 isn't really intended to be used to batch import your objects into Neo4j. It's an Object Graph Mapping framework for general-purpose Java applications, not a batch importer (which brings its own specific set of problems to the table). Some of the design decisions made to support the intended use case for SDN run contrary to what you would do if you were designing a purpose-built ETL tool. We are also constrained by the performance of Neo4j's HTTP transactional endpoint, which, although by no means slow in absolute terms, cannot hope to compete with the Batch Inserter, for example.

There are some performance improvements we will be making in the future, and when the new binary protocol for Neo4j is released (2.3), we will be plugging it in as our transfer protocol. We expect this to improve transfer speeds to and from the database by at least an order of magnitude. However, please don't expect these changes to radically alter the behavioural characteristics of SDN 4. While a future version might be able to load a few thousand nodes much faster than it can currently, it still won't be an ETL tool, and I wouldn't expect it to be used as such.
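To illustrate the Batch Inserter alternative the answer mentions, here is a minimal sketch for a one-off bulk load with Neo4j 2.2 (the store path, labels, and property names are placeholders). It writes to the store files directly, so the database must not be running at the same time:

    import org.neo4j.graphdb.DynamicLabel;
    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.helpers.collection.MapUtil;
    import org.neo4j.unsafe.batchinsert.BatchInserter;
    import org.neo4j.unsafe.batchinsert.BatchInserters;

    public class TimetreeBatchLoad {
        public static void main(String[] args) {
            // Opens the store directly; no server, no transactions.
            BatchInserter inserter = BatchInserters.inserter("data/graph.db");
            try {
                long root = inserter.createNode(
                        MapUtil.map("name", "root"), DynamicLabel.label("Root"));
                long year = inserter.createNode(
                        MapUtil.map("value", 2015), DynamicLabel.label("Year"));
                // Relationships are created by node id; no object graph involved.
                inserter.createRelationship(root, year,
                        DynamicRelationshipType.withName("CHILD"), null);
            } finally {
                inserter.shutdown(); // flushes everything to disk
            }
        }
    }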

0 votes

After some hours of trial and error, I finally found that I needed to limit the save depth.

Previously, I didn't specify a depth, so the object being saved grew larger and larger as each save also cascaded through all of its children. After passing a depth of 1 to every save call, I finally got rid of the StackOverflowError. And by not saving continuously (I put all the objects in an ArrayList and save them all at the end), I gained about a minute (from 3.5 minutes down to 2.5 minutes) for importing ca. 1,000 nodes (with relationships).
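A minimal sketch of what I mean, using the OGM Session directly (the entity type and variable names are placeholders):

    import java.util.List;

    import org.neo4j.ogm.session.Session;

    public class DepthLimitedImport {

        private final Session session;

        public DepthLimitedImport(Session session) {
            this.session = session;
        }

        public void importAll(List<Object> entities) {
            // Collect everything while building, then save at the end.
            for (Object entity : entities) {
                // Depth 1: persist the node and its direct relationships only,
                // instead of recursing through the whole (cyclic) object graph.
                session.save(entity, 1);
            }
        }
    }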

Nevertheless, the performance is still not satisfying, since I could import over 60,000 records in less than a minute with my previous MongoDB implementation. I don't know whether this is inherent to SDN 4, or whether it could be faster with the embedded API. I'm really curious whether anyone has benchmarked SDN 4 against the embedded API.