I am using Titan 0.4.3 and Rexster 2.4 over Cassandra and Elasticsearch. My use case requires bulk-uploading vertices and edges into the graph. Right now I am calling commit() after adding each vertex+edge. I ran some tests with different commit frequencies (every 10K, every 1,000, and every 100), and the upload speed varies dramatically. How can I arrive at the optimum commit frequency, and what parameters does it depend on?

Any suggestions for improving performance in my use case? The database holds roughly 10M vertices.

1 Answer

I'm not so sure I've ever come across a magic number that represents the optimal commit frequency. It seems to be largely dependent on the data loading strategy. I tend to start with 10000 as the commit size and work my way up from there. There's usually a bit more art to arriving at that number than science unfortunately.
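Since the number has to be found empirically, it helps to make the commit size a parameter and time the load for each candidate. The following is only a rough sketch of that idea, not Titan's API: `FakeGraph` is a hypothetical in-memory stand-in for whatever transactional client you use, and `load_in_batches` is a name made up for illustration.

```python
import time


class FakeGraph:
    """Hypothetical stand-in for a transactional graph client (not Titan's API)."""

    def __init__(self):
        self.pending = []      # elements added since the last commit
        self.committed = []    # elements made durable by commit()
        self.commit_calls = 0

    def add_edge(self, out_id, in_id):
        self.pending.append((out_id, in_id))

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()
        self.commit_calls += 1


def load_in_batches(graph, edges, batch_size):
    """Add edges, committing every `batch_size` edges; return edges per second."""
    start = time.perf_counter()
    count = 0
    for out_id, in_id in edges:
        graph.add_edge(out_id, in_id)
        count += 1
        if count % batch_size == 0:
            graph.commit()
    if count % batch_size != 0:
        graph.commit()  # flush the final partial batch
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")
```

Running this against a representative sample of the real data with a few sizes (say 1,000, 10,000, 50,000) and comparing the returned rates gives a starting point. Larger batches generally mean fewer round trips but more uncommitted state held in memory, so the best value will likely depend on element size and available heap.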

You can, however, speed up your load in other ways: caching commonly used vertices to reduce index lookups, pre-sorting the data so that those vertices stay in cache, turning off locking if possible, and so on. If you haven't read the "Powers of Ten" blog post series, Part I might be helpful, as it addresses strategies for your graph size.
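To illustrate the caching and pre-sorting suggestions, here is a minimal sketch of a bounded LRU cache in front of an index lookup. Everything in it is an assumption for illustration: `VertexCache` is a made-up class, and `lookup` stands for whatever call resolves a key to a vertex in your setup.

```python
from collections import OrderedDict


class VertexCache:
    """Small LRU cache in front of an expensive vertex lookup (illustrative only)."""

    def __init__(self, lookup, capacity=100_000):
        self._lookup = lookup          # hypothetical: key -> vertex, e.g. an index query
        self._capacity = capacity
        self._cache = OrderedDict()
        self.misses = 0                # number of times the real lookup was needed

    def get(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)   # mark as most recently used
            return self._cache[key]
        self.misses += 1
        vertex = self._lookup(key)
        self._cache[key] = vertex
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return vertex
```

Sorting the edge list by source key before loading keeps repeated keys adjacent, so even a small cache absorbs most lookups; with unsorted input the same cache may evict a vertex just before it is needed again.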