Neo4j writes slow down after a few thousand records are written.
Two indexes exist to speed the query up, and I have checked with EXPLAIN that each query runs in constant time:
indexOrder = schema.indexFor(ORDER)
        .on("id")
        .create();
indexShop = schema.indexFor(SHOP)
        .on("domain")
        .create();
This is the query I use:
WITH {json} as log
WITH log.order as order, log.shop as shop
MERGE (s:Shop {domain:shop.domain})
ON CREATE SET s=shop
MERGE (s)-[:Scored]->(r:Order {id:order.id})
ON CREATE SET r=order
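The {json} parameter is a nested map holding the shop and order sub-maps that the query unpacks. Roughly like this (only domain and id matter for the MERGEs; the other property names here are just examples of extra fields copied by SET s=shop / SET r=order):

Map<String, Object> shop = new HashMap<>();
shop.put("domain", "example.com");
shop.put("name", "Example Shop");   // example extra property

Map<String, Object> order = new HashMap<>();
order.put("id", "order-123");
order.put("total", 42.0);           // example extra property

Map<String, Object> map = new HashMap<>();
map.put("shop", shop);
map.put("order", order);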
Here is how I commit it to the database:
private void log() {
    try (Transaction tx = graphDb.beginTx()) {
        // Run the MERGE query once per buffered entry
        // (singletonMap is java.util.Collections.singletonMap)
        for (Map map : list) {
            graphDb.execute(query, singletonMap("json", map));
        }
        // Reset the buffer and commit the batch
        list = new ArrayList<>();
        tx.success();
    }
}
I call the above once 1k log entries have accumulated:
list.add(map);
count++;
if (count % 1000 == 0) {
    log();
    System.out.println(count);
}
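For context, the fields this snippet relies on look roughly like this (a sketch; the names match the code above):

private List<Map> list = new ArrayList<>();  // buffered log entries
private long count = 0;                      // total entries seen so far
private String query;                        // the MERGE query shown above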
Additional info: I use these config settings:
.setConfig(GraphDatabaseSettings.pagecache_memory, "512M")
.setConfig(GraphDatabaseSettings.string_block_size, "60")
.setConfig(GraphDatabaseSettings.array_block_size, "300")
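These settings are applied when the embedded database is built, roughly like this (a sketch, assuming a 3.x embedded setup; the store path is just an example):

graphDb = new GraphDatabaseFactory()
        .newEmbeddedDatabaseBuilder(new File("data/graph.db"))
        .setConfig(GraphDatabaseSettings.pagecache_memory, "512M")
        .setConfig(GraphDatabaseSettings.string_block_size, "60")
        .setConfig(GraphDatabaseSettings.array_block_size, "300")
        .newGraphDatabase();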
The system works for up to 200k entries if everything is done in a single transaction, but then it runs into memory issues.
So why does the 1k-entries-per-transaction approach grind to a halt after just 5 transactions (5k entries) have been committed to the database?
How do I fix the problem?