I'm experiencing surprisingly slow retrieval of results from a ResourceIterator<Node> when I read the results of a Cypher query executed in Java. The next() call takes 156 ms on average, with a standard deviation of 385 ms! Is this behavior expected, or am I doing something wrong? Can anybody suggest a more efficient way of achieving the same thing?
Graph structure
I have the following graph layout, where Point nodes have LinksTo relations to other points:
Node:Point
Properties:
- idPoint (new style schema unique constraint on this property)
- x (new style schema index on this property)
- y (new style schema index on this property)
Relation:LinksTo
Properties:
- idLink
- length
(...relations don't even play a role in my question...)
Graph statistics:
- # of nodes: 890,000
- # of relations: 910,000
Old code
(Using Neo4j 2.0.0 stable with Oracle Java 7 on Ubuntu)
(Basically, this code searches for nodes (points) in a 60x60 square around the given point.)
GraphDatabaseService graphDB = new GraphDatabaseFactory ( ).newEmbeddedDatabase ("points_db");
ExecutionEngine engine = new ExecutionEngine (graphDB);
for (Coordinate c : coords) // coords holds 500 different coordinates
{
    int size = 30;
    int xMin = c.x - size;
    int xMax = c.x + size;
    int yMin = c.y - size;
    int yMax = c.y + size;
    String query = "MATCH (n:POINT) " +
                   " WHERE n.x > " + xMin +
                   " AND n.x < " + xMax +
                   " AND n.y > " + yMin +
                   " AND n.y < " + yMax +
                   " RETURN n AS neighbour";
    ExecutionResult result = engine.execute (query); // command1
    ResourceIterator<Node> ri = result.columnAs ("neighbour"); // command2
    while (ri.hasNext ( ))
    {
        Node n = ri.next ( ); // command3
        // ... some code ...
    }
}
Measurements
command1 average execution time: 7.5 ms
command2 average execution time: <1 ms
command3 average execution time: 156 ms (standard deviation: 358 ms)
(Measurements were taken over 500 iterations with different coordinates; on average, 6 points are found in each iteration. The measurements are repeatable.)
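For reference, here is a minimal sketch of how per-call timings such as the command3 figures could be collected. It is not the exact harness used for the numbers above: the names TimingSketch, TimingStats and timeNextCalls are made up for illustration, and a plain list iterator stands in for the ResourceIterator<Node>. It assumes System.nanoTime() around each next() call and a running mean and sample standard deviation (Welford's method).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TimingSketch {

    /** Running mean and sample standard deviation of timing samples (Welford's method). */
    static class TimingStats {
        private long count = 0;
        private double mean = 0.0;
        private double m2 = 0.0; // sum of squared deviations from the running mean

        /** Records one sample, in milliseconds. */
        void add(double millis) {
            count++;
            double delta = millis - mean;
            mean += delta / count;
            m2 += delta * (millis - mean);
        }

        long count() { return count; }

        double mean() { return mean; }

        /** Sample standard deviation (n - 1 denominator); 0 for fewer than two samples. */
        double stdDev() {
            return count > 1 ? Math.sqrt(m2 / (count - 1)) : 0.0;
        }
    }

    /**
     * Drains the iterator and times every next() call separately.
     * hasNext() is outside the timed region, so work done there is not counted.
     */
    static <T> TimingStats timeNextCalls(Iterator<T> it) {
        TimingStats stats = new TimingStats();
        while (it.hasNext()) {
            long start = System.nanoTime();
            it.next();
            stats.add((System.nanoTime() - start) / 1_000_000.0);
        }
        return stats;
    }

    public static void main(String[] args) {
        // Stand-in for the iterator returned by result.columnAs("neighbour").
        List<String> fake = Arrays.asList("a", "b", "c");
        TimingStats stats = timeNextCalls(fake.iterator());
        System.out.printf("next(): n=%d, mean=%.3f ms, sd=%.3f ms%n",
                stats.count(), stats.mean(), stats.stdDev());
    }
}
```

In the real loop, one TimingStats instance per command (command1, command2, command3) would be shared across all 500 iterations, so that the mean and standard deviation cover every call.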
EDIT 1 (As suggested by Luanne and Michael)
New, faster code with parameterization
(Using Neo4j 2.0.0 stable with Oracle Java 7 on Ubuntu)
(Basically, this code searches for nodes (points) in a 60x60 square around the given point.)
GraphDatabaseService graphDB = new GraphDatabaseFactory ( ).newEmbeddedDatabase ("points_db");
ExecutionEngine engine = new ExecutionEngine (graphDB);
Map<String, Object> params = new HashMap<> ( );
int size = 30;
String query = "MATCH (n:POINT) " +
               " WHERE n.x > {xMin}" +
               " AND n.x < {xMax}" +
               " AND n.y > {yMin}" +
               " AND n.y < {yMax}" +
               " RETURN n AS neighbour";
for (Coordinate c : coords) // coords holds 500 different coordinates
{
    params.put ("xMin", (int) c.x - size);
    params.put ("xMax", (int) c.x + size);
    params.put ("yMin", (int) c.y - size);
    params.put ("yMax", (int) c.y + size);
    ExecutionResult result = engine.execute (query, params); // command1
    ResourceIterator<Node> ri = result.columnAs ("neighbour"); // command2
    while (ri.hasNext ( ))
    {
        Node n = ri.next ( ); // command3
        // ... some code ...
    }
}
Measurements
command1 average execution time: 1.7 ms
command2 average execution time: <1 ms
command3 average execution time: 112 ms (standard deviation: 270 ms)
(Measurements were taken over 500 iterations with different coordinates; on average, 6 points are found in each iteration. The measurements are repeatable.)