I have been playing around with a very small graph of 264,346 vertices and 733,846 edges. I bulk imported it into Titan using BatchGraph and the recommended settings. This worked out fine - it is surprisingly efficient.
I am now playing around with graph traversals and implemented a quick Dijkstra's algorithm in Java using the Java libraries. I should mention I have a local Cassandra node that my Titan database is running on - it is not the embedded one.
An average Dijkstra run takes 1+ minute for an average path in this graph. This is extremely long. I would expect a very slow Dijkstra run on such a graph to take under 10 seconds (with an in-memory graph average queries on this graph size would take well under 1 second).
What are the best practices for running such algorithms over Titan efficiently?
I will give parts of the simple Dijkstra implementation in case the way I am accessing vertices and edges is not the most efficient way.
Getting the graph instance:
TitanGraph graph = TitanFactory.open("cassandra:localhost");
Parts of the Dijkstra implementation (specifically involving graph access):
public int run(long src, long trg){
this.pqueue.add(new PQNode(0, src));
this.nodeMap.put(src, new NodeInfo(src, 0, -1));
int dist = Integer.MAX_VALUE;
while (!this.pqueue.isEmpty()){
PQNode current = this.pqueue.poll();
NodeInfo nodeInfo = this.nodeMap.get(current.getNodeId());
long u = nodeInfo.getNodeId();
if (u == trg) {
dist = nodeInfo.getDistance();
break;
}
if (nodeInfo.isSeen())
continue;
this.expansion++;
TitanVertex vertex = graph.getVertex(u);
for (TitanEdge out: vertex.getEdges()){
Direction dir = out.getDirection(vertex);
if (dir.compareTo(Direction.OUT) != 0 && dir.compareTo(Direction.BOTH) != 0){
continue;
}
TitanVertex v = out.getOtherVertex(vertex);
long vId = (long)v.getId();
NodeInfo vInfo = this.nodeMap.get(vId);
if (vInfo == null){
vInfo = new NodeInfo(vId, Integer.MAX_VALUE, -1);
this.nodeMap.put(vId, vInfo);
}
int weight = out.getProperty("weight");
int currentDistance = nodeInfo.getDistance() + weight;
if (currentDistance < vInfo.getDistance()){
vInfo.setParent(u);
vInfo.setDistance(currentDistance);
this.pqueue.add(new PQNode(currentDistance, vId));
}
}
nodeInfo.setSeen(true);
}
return dist;
}
How should I proceed trying to execute such algorithms efficiently over Titan?