Data arrives in the system continuously at a rate of 300-500 TPS. I need to import it into Neo4j according to the following scheme:
- If node N does not exist, create it
- If the relationship N-[rel:rel_type]->X does not exist, create it
- Increment rel.weight
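To make the intended semantics concrete, here is a minimal in-memory sketch in Python. It does not talk to Neo4j; `GraphModel` and `record` are hypothetical names used only for illustration, and it assumes that the target node X is also created when it is missing, which the scheme above does not state.

```python
from collections import defaultdict


class GraphModel:
    """In-memory stand-in for the graph (hypothetical, not a Neo4j API)."""

    def __init__(self):
        self.nodes = set()
        # (source, rel_type, target) -> weight
        self.weights = defaultdict(int)

    def record(self, n, rel_type, x):
        # Step 1: create the node if it does not exist.
        # Assumption: the target node x is created the same way.
        self.nodes.add(n)
        self.nodes.add(x)
        # Steps 2 and 3: a missing relationship starts at weight 0
        # (defaultdict creates it on first access), then it is incremented.
        self.weights[(n, rel_type, x)] += 1
        return self.weights[(n, rel_type, x)]


g = GraphModel()
g.record("a", "rel_type", "b")
g.record("a", "rel_type", "b")
g.record("a", "rel_type", "c")
```

Each incoming event is one `record` call, so the same three steps have to run 300-500 times per second against the real database.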
This seems impossible to solve with the REST batch API. Separate Cypher queries take too long because they generate many small transactions.
Gremlin works much faster. I collect the parameters for the Gremlin script in an array and execute it as a batch. But even so, I could hardly reach 300 TPS.
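For illustration, this kind of parameter batching could look roughly like the sketch below. It is only a sketch under assumptions: `BatchBuffer` is a hypothetical helper, `send` is a placeholder callback standing in for whatever call actually submits the script to the server, and the batch size is arbitrary.

```python
class BatchBuffer:
    """Collects parameter dicts and hands them to `send` in batches (hypothetical helper)."""

    def __init__(self, send, batch_size=100):
        self.send = send              # placeholder for the real submit call
        self.batch_size = batch_size  # arbitrary value, needs tuning
        self.pending = []

    def add(self, n, rel_type, x):
        # One parameter set per incoming event.
        self.pending.append({"n": n, "rel_type": rel_type, "x": x})
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send whatever has accumulated as a single batch.
        if self.pending:
            batch, self.pending = self.pending, []
            self.send(batch)


sent = []  # stands in for the server; each element is one submitted batch
buf = BatchBuffer(sent.append, batch_size=2)
buf.add("a", "rel_type", "b")
buf.add("a", "rel_type", "c")
buf.add("b", "rel_type", "c")
buf.flush()  # push the remaining partial batch
```

Here three events result in two submissions instead of three, which is the whole point of batching: fewer round trips and fewer transactions per event.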
I should mention that, in addition to the writes, there will be a stream of read queries at ~500 TPS:
START N=node(...) MATCH N-[rel:rel_type]->X return rel.weight,X.name;
The heap size is set to 5 GB. Additional JVM options:
-XX:MaxPermSize=1G -XX:+CMSClassUnloadingEnabled -XX:+UseParallelGC -XX:+UseNUMA
What is the optimal approach and configuration for importing this kind of data?