2
votes

I'm developing an application using Titan + DynamoDB Local as storage backend. Previously I used Berkeley as storage backend, which had great performance. Ever since switching though, I can't even use the application because everything takes ages.

A simple test case adds 30 vertices and commits the changes; This is what I get:

//Berkeley Backend
Committing took 0.092 seconds.

//DynamoDB Local Backend    
Committing took 8.684 seconds.

Subsequent commits take just as long. I know that dynamodb won't be as quick as Berkeley as it's just a wrapped SQL database intended for development, however it must be faster than this. I can only guess that somethig's wrong with my configuration, which on the other hand took straight from their repository. Running dynamo within memory only improves performance slightly.

I've already tried increasing the capacity read/write units with no luck. Does anyone have a moderately fast version of dynamo running locally?

Here's my current configuration:

#general Titan configuration
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.setup-wait=60000
storage.buffer-size=1024
# Metrics configuration - http://s3.thinkaurelius.com/docs/titan/1.0.0/titan-config-ref.html#_metrics
#metrics.enabled=true
#metrics.prefix=t
# Required; specify logging interval in milliseconds
#metrics.csv.interval=500
#metrics.csv.directory=metrics
# Turn off titan retries as we batch and have our own exponential backoff strategy.
storage.write-time=1 ms
storage.read-time=1 ms
storage.backend=com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager

#Amazon DynamoDB Storage Backend for Titan configuration
storage.dynamodb.force-consistent-read=true
# should be the graph name rexster/graphs/graph/graph-name
storage.dynamodb.prefix=v100
storage.dynamodb.metrics-prefix=d
storage.dynamodb.enable-parallel-scans=false
storage.dynamodb.max-self-throttled-retries=60
storage.dynamodb.control-plane-rate=10

# DynamoDB client configuration: credentials
storage.dynamodb.client.credentials.class-name=com.amazonaws.auth.BasicAWSCredentials
storage.dynamodb.client.credentials.constructor-args=access,secret

# DynamoDB client configuration: endpoint (Below, set to DynamoDB Local as invoked by mvn test -Pstart-dynamodb-local).
# You can change the endpoint to point to Production DynamoDB regions.)
storage.dynamodb.client.endpoint=http://localhost:1234

# max http connections - not recommended to use more than 250 connections in DynamoDB Local
storage.dynamodb.client.connection-max=250
# turn off sdk retries
storage.dynamodb.client.retry-error-max=0

# DynamoDB client configuration: thread pool
storage.dynamodb.client.executor.core-pool-size=25
# Do not need more threads in thread pool than the number of http connections
storage.dynamodb.client.executor.max-pool-size=250
storage.dynamodb.client.executor.keep-alive=60000
storage.dynamodb.client.executor.max-concurrent-operations=1
# should be at least as large as the storage.buffer-size
storage.dynamodb.client.executor.max-queue-length=1024

# 750 r/w CU result in provisioning the maximum equal numbers read and write Capacity Units that can
# be set on one table before it is split into two or more partitions for IOPS. If you will have more than one Rexster server
# accessing the same graph, you should set the read-rate and write-rate properties to values commensurately lower than the
# read and write capacity of the backend tables.

storage.dynamodb.stores.edgestore.capacity-read=100
storage.dynamodb.stores.edgestore.capacity-write=100
storage.dynamodb.stores.edgestore.read-rate=100
storage.dynamodb.stores.edgestore.write-rate=100
storage.dynamodb.stores.edgestore.scan-limit=10000

storage.dynamodb.stores.graphindex.capacity-read=100
storage.dynamodb.stores.graphindex.capacity-write=100
storage.dynamodb.stores.graphindex.read-rate=100
storage.dynamodb.stores.graphindex.write-rate=100
storage.dynamodb.stores.graphindex.scan-limit=10000

storage.dynamodb.stores.systemlog.capacity-read=10
storage.dynamodb.stores.systemlog.capacity-write=10
storage.dynamodb.stores.systemlog.read-rate=10
storage.dynamodb.stores.systemlog.write-rate=10
storage.dynamodb.stores.systemlog.scan-limit=10000

storage.dynamodb.stores.titan_ids.capacity-read=10
storage.dynamodb.stores.titan_ids.capacity-write=10
storage.dynamodb.stores.titan_ids.read-rate=10
storage.dynamodb.stores.titan_ids.write-rate=10
storage.dynamodb.stores.titan_ids.scan-limit=10000

storage.dynamodb.stores.system_properties.capacity-read=10
storage.dynamodb.stores.system_properties.capacity-write=10
storage.dynamodb.stores.system_properties.read-rate=10
storage.dynamodb.stores.system_properties.write-rate=10
storage.dynamodb.stores.system_properties.scan-limit=10000

storage.dynamodb.stores.txlog.capacity-read=10
storage.dynamodb.stores.txlog.capacity-write=10
storage.dynamodb.stores.txlog.read-rate=10
storage.dynamodb.stores.txlog.write-rate=10
storage.dynamodb.stores.txlog.scan-limit=10000

# elasticsearch config that is required to run GraphOfTheGods
index.search.backend=elasticsearch
index.search.directory=/tmp/searchindex
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true
index.search.elasticsearch.interface=NODE

Test Code:

private long start;

    @Test
    public void performanceTest() {
        Graph graph = GraphFactory.getDefault();

        for (int i=1; i<=30; i++) {
            graph.addVertex("myLabel");
        }

        startStopwatch();
        graph.tx().commit();
        stopStopwatch("Committing");
    }

    private void startStopwatch() {
        start = System.currentTimeMillis();
    }

    private void stopStopwatch(String opName) {
        System.out.println(opName + " took " + (System.currentTimeMillis() - start) / 1000.0 + " seconds.");
    }

Versions

  • Titan 1.0.0
  • DynamoDB Storage Backend 1.0.2
  • DynamoDB Local 1.11.86
  • Berkeley 1.0.0

Related

1

1 Answers

1
votes

I created an issue on GitHub to track reproduction of your perceived latency. I measure only 84 ms commit time on the command line when I run your code, and even if I include all of the setup and static Java initialization code, end-to-end the test takes only 5 seconds. Please try out the master branch.

final Graph graph = JanusGraphFactory.open(TestGraphUtil.instance.createTestGraphConfig(BackendDataModel.MULTI));
IntStream.of(30).forEach(i -> graph.addVertex(LABEL));
Stopwatch watch = Stopwatch.createStarted();
graph.tx().commit();
System.out.println("Committing took " + watch.stop().elapsed(TimeUnit.MILLISECONDS) + " ms");
TestGraphUtil.instance.cleanUpTables();

DynamoDB Local is a testing tool and you should not expect to see the same low latency that you do when using the DynamoDB service. Having said that, you will get the best performance out of DynamoDB Local when you use the inMemory option.