titan-hbase-solr graph load gremlin-server java.lang.OutOfMemoryError: GC overhead limit exceeded

Question

We are trying to create a huge graph (around 100,000 vertices) in Titan over a Gremlin Server remote connection. We followed the example code at https://github.com/pluradj/titan-tp3-driver-example to set up the remote connection to Titan via Gremlin Server. We can create indices, vertices, and edges, and query the simple graphs we created, without any problem.

However, when we try to create a huge graph using a generator (it creates vertices and edges directly on the server over the established remote connection), we get the following error:

 6041316 [gremlin-server-exec-6] WARN  org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor  - Exception processing a script on request [RequestMessage{, requestId=81f949ad-0e37-4293-bcaa-0714cb159c3b, op='eval', processor='', args={gremlin=g.V().has('idObj', 'OC97').next().addEdge('OC_LC', g.V().has('idObj', 'LC9643').next()), batchSize=64}}].
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.codehaus.groovy.reflection.CachedClass$3.initValue(CachedClass.java:106)
    at org.codehaus.groovy.reflection.CachedClass$3.initValue(CachedClass.java:84)
    at org.codehaus.groovy.util.LazyReference.getLocked(LazyReference.java:49)
    at org.codehaus.groovy.util.LazyReference.get(LazyReference.java:36)
    at org.codehaus.groovy.reflection.CachedClass.getMethods(CachedClass.java:260)
    at groovy.lang.MetaClassImpl.addInterfaceMethods(MetaClassImpl.java:419)
    at groovy.lang.MetaClassImpl.fillMethodIndex(MetaClassImpl.java:342)
    at groovy.lang.MetaClassImpl.initialize(MetaClassImpl.java:3264)
    at org.codehaus.groovy.reflection.ClassInfo.getMetaClassUnderLock(ClassInfo.java:254)
    at org.codehaus.groovy.reflection.ClassInfo.getMetaClass(ClassInfo.java:285)
    at org.codehaus.groovy.reflection.ClassInfo.getMetaClass(ClassInfo.java:295)
    at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.getMetaClass(MetaClassRegistryImpl.java:261)
    at org.codehaus.groovy.runtime.InvokerHelper.getMetaClass(InvokerHelper.java:873)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.createPojoSite(CallSiteArray.java:125)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:166)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
    at Script72559.run(Script72559.groovy:1)
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:534)
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:374)
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
    at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:102)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:258)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor$$Lambda$137/1500273035.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

The graph generation is fast at the beginning, slows down gradually, and fails at around 31000 vertices with the above error.

We have tried changing the default cache parameters as follows:

cache.db-cache=true
cache.db-cache-clean-wait=0
cache.db-cache-time=10000
cache.db-cache-size=0.1

We have also tried disabling the cache by setting cache.db-cache=false, but neither change worked for us.

Our environment:
CDH 5.7.1
Titan 1.1.0-SNAPSHOT
Solr 4.10.3
HBase 1.2.0

Could you please guide us on how to overcome this problem?

How many elements are you trying to create in a single commit? Try committing smaller numbers of vertices/edges at a time rather than 100K in one shot. — Jason Plurad
Also make sure you use parameterized scripts: tinkerpop.apache.org/docs/current/reference/… — Jason Plurad
The problem is almost certainly related to parameterization. — stephen mallette
@stephen: I can't see any difference when I change the cache parameters; it always fails after creating around 31000 vertices — bbary
@Jason Plurad: I am using a remote connection to Gremlin Server, so I think no transaction commit is needed. I can see every vertex created during execution without committing. Nevertheless, I do a graph traversal commit (g.tx().commit) each time I generate around 10 vertices. The commands to create vertices and edges are already parameterized — bbary

bbary · Accepted Answer · 2017-01-09T17:35:40

The problem was that we forgot to use parameterized scripts: http://tinkerpop.apache.org/docs/current/reference/#parameterized-scripts

Gremlin Server caches every script that is passed to it. With parameterized scripts, the script text stays the same across requests (for example g.V(x)) and only the bound values change, so far fewer distinct scripts end up in the cache.

Building each script with String.format instead (as we did) produces a different script string for every request, so every Gremlin script is cached separately. That is very expensive and eventually causes an OutOfMemoryError.
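
To make the difference concrete, here is a minimal Java sketch of the idea. It does not talk to Gremlin Server: FakeScriptCache is a hypothetical stand-in for a cache keyed by script text, and the binding names fromId and toId are made up for the example. With the real TinkerPop driver, the parameterized form would be sent with Client.submit(String, Map<String, Object>).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ScriptCacheSketch {

    // Hypothetical stand-in for a server-side cache keyed by script text.
    // It only records which distinct script strings it has seen.
    static final class FakeScriptCache {
        private final Set<String> compiled = new HashSet<>();

        void eval(String script, Map<String, Object> bindings) {
            compiled.add(script); // bindings never affect the cache key
        }

        int size() {
            return compiled.size();
        }
    }

    // Anti-pattern: values are baked into the script text with String.format,
    // so every request carries a brand new script string.
    static int distinctScriptsWithStringFormat(int edges) {
        FakeScriptCache cache = new FakeScriptCache();
        for (int i = 0; i < edges; i++) {
            String script = String.format(
                "g.V().has('idObj', '%s').next().addEdge('OC_LC', g.V().has('idObj', '%s').next())",
                "OC" + i, "LC" + i);
            cache.eval(script, new HashMap<>());
        }
        return cache.size();
    }

    // Parameterized: the script text is constant and values travel as bindings.
    // With the real driver this would be client.submit(script, bindings).
    static int distinctScriptsWithBindings(int edges) {
        FakeScriptCache cache = new FakeScriptCache();
        String script =
            "g.V().has('idObj', fromId).next().addEdge('OC_LC', g.V().has('idObj', toId).next())";
        for (int i = 0; i < edges; i++) {
            Map<String, Object> bindings = new HashMap<>();
            bindings.put("fromId", "OC" + i);
            bindings.put("toId", "LC" + i);
            cache.eval(script, bindings);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctScriptsWithStringFormat(1000)); // prints 1000
        System.out.println(distinctScriptsWithBindings(1000));     // prints 1
    }
}
```

In this sketch the String.format version leaves one cache entry per edge, while the parameterized version leaves a single entry no matter how many edges are created, which matches the slowdown and eventual failure we saw as the load progressed.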

I hope this helps someone ;)
