When inserting tens of thousands of nodes and edges into a Cassandra-backed TinkerPop graph, I see that every service except Gremlin Server is mostly idle. That is, the client connected to the websocket and sending the Gremlin-formatted commands is not consuming much CPU time, and neither is Cassandra or Elasticsearch. Gremlin Server, on the other hand, is consuming several CPUs (on a rather beefy machine with dozens of cores and hundreds of gigabytes of RAM).
Increasing the number of Gremlin Server worker threads doesn't have a positive impact, and neither does increasing the number of simultaneous websocket requests permitted (a client-side setting). Oddly, allowing an unbounded number of concurrent websocket requests results in data silently failing to be inserted, with no error responses coming back.
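For reference, the client talks to Gremlin Server through the TinkerPop driver roughly like this (a minimal sketch in Groovy; the host, pool sizes, and buildBatchScript() helper are illustrative, not my actual code):

import org.apache.tinkerpop.gremlin.driver.Client
import org.apache.tinkerpop.gremlin.driver.Cluster

// Illustrative host and pool sizes; these are the client-side knobs mentioned above.
Cluster cluster = Cluster.build()
        .addContactPoint('gremlin-server-host')
        .port(8182)
        .maxConnectionPoolSize(8)                 // connections to the server
        .maxSimultaneousUsagePerConnection(16)    // concurrent requests allowed per connection
        .create()
Client client = cluster.connect()

// buildBatchScript() is a hypothetical helper that concatenates the
// ~100 addVertex/addEdge statements shown below into one Groovy script.
String script = buildBatchScript()
client.submit(script).all().get()                 // block until Gremlin Server acknowledges the batch

cluster.close()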
The working theory is that Gremlin Server's bottleneck is parsing and evaluating the Gremlin commands (g.addV, etc.). Does anyone have experience getting high ingest rates through the websocket channel, or is it necessary for me to write my own JVM language plugin that works on binary data so as to avoid parsing and evaluating strings?
EDIT: The scripts are batches of up to 100 statements of either vertex insertions or edge/vertex/edge insertions:
The vertex insertions:
graph.addVertex(label, tyParam, 'ident', vertexName, param1, val1, param2, val2, ...);
graph.addVertex(...);
...
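So a single generated statement looks something like this (label and property values invented for illustration):

graph.addVertex(label, 'host', 'ident', 'host-42', 'os', 'linux', 'cores', 16);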
For triples of edge, vertex, edge:
edgeNode = graph.addVertex(...);
g.V().has('ident', var).next().addEdge(var2, edgeNode);
edgeNode.addEdge(var3, g.V().has('ident', var4).next());
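A concrete instance of one such triple (labels and values invented for illustration):

edgeNode = graph.addVertex(label, 'connection', 'port', 443);                // reified edge vertex
g.V().has('ident', 'host-42').next().addEdge('connectedFrom', edgeNode);     // source -> edge vertex
edgeNode.addEdge('connectedTo', g.V().has('ident', 'host-99').next());       // edge vertex -> destination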
'ident' is indexed on vertices, so the .has() lookup should be fast. Sadly, the dataset includes edges whose source or destination vertex does not exist, which causes "FastNoSuchElementException" errors. When a script fails, we split its set of statements in half and retry it as two smaller insertion attempts; for example, a failing script of 50 edge/vertex/edge insertion statements becomes two scripts of 25, and this continues all the way down to a script with a single e/v/e insertion, where any failure is ignored.
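The retry logic is roughly the following (a sketch; submitScript() stands in for whatever sends one script over the websocket and throws on failure):

def insertWithBisection(List<String> statements) {
    try {
        submitScript(statements.join('\n'))          // send the whole batch as one script
    } catch (Exception e) {
        if (statements.size() == 1) return           // a single e/v/e statement: ignore the failure
        int mid = statements.size().intdiv(2)        // otherwise halve the batch and retry each half
        insertWithBisection(statements.subList(0, mid))
        insertWithBisection(statements.subList(mid, statements.size()))
    }
}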
N.B. I'm using Titan 1.0.