
I am currently running some simple Cypher queries (count, etc.) on a large dataset (>10 GB) and am having some issues with tuning Neo4j.

The machine running the queries has 4 TB of RAM and 160 cores, and runs Ubuntu 14.04 with Neo4j 2.3. Originally I left all the settings at their defaults, as it is stated that free memory will be dynamically allocated as required. However, as the queries are taking several minutes to complete, I assumed this was not the case. As such, I have tried various combinations of the following parameters in neo4j-wrapper.conf:

wrapper.java.initmemory=1200000 
wrapper.java.maxmemory=1200000
dbms.memory.heap.initial_size=1200000
dbms.memory.heap.max_size=1200000
dbms.jvm.additional=-XX:NewRatio=1

and the following within neo4j.properties:

use_memory_mapped_buffers=true
neostore.nodestore.db.mapped_memory=50G
neostore.relationshipstore.db.mapped_memory=50G
neostore.propertystore.db.mapped_memory=50G
neostore.propertystore.db.strings.mapped_memory=50G
neostore.propertystore.db.arrays.mapped_memory=1G

These follow every guide and Stack Overflow post I could find on the topic, but I seem to have exhausted the available material with little effect.

I am running queries through the shell using the command neo4j-shell -c < "queries/$1.cypher", but have also tried explicitly passing the conf files with -config $NEO4J_HOME/conf/neo4j-wrapper.conf (restarting the server every time I make a change).

I imagine that I have missed something silly which is causing the issue, as there are many reports of Neo4j working well with data of this size, but I cannot think what it could be. Any help would be greatly appreciated.

There are many reasons other than insufficient memory for queries to be slow: missing indexes, hitting too many nodes, etc. Have you tried using EXPLAIN / PROFILE on your queries? - Frank Pavageau
The server is currently being rebooted; I will give this a go when it is finished and report back on the result. - ben steer
May we know how the accepted answer solved your problem? Was it the page cache that helped, or did you find something else? - Haoyang Feng
The page cache seemed to make a marginal difference, as did the addition of indexes. However, the real improvement seemed to come from tuning the JVM, garbage collection and scheduler. If you are interested, I can link you to a full description of the changes that were made. - ben steer

1 Answer


Type :SCHEMA in the Neo4j Browser to check whether you have indexes.

Share a couple of your queries.

In the neo4j.properties file, you need to set dbms.pagecache.memory to about 1.5x the size of your database files. For your roughly 10 GB dataset, you can set it to 15g.
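
As a minimal sketch of that setting, assuming the store files really do total roughly 10 GB as the question suggests (check the actual size of the database directory on disk before settling on a value), the entry would look something like this:

```properties
# conf/neo4j.properties (Neo4j 2.3)
# Page cache sized at about 1.5x the store files.
# 15g is an assumption based on a ~10 GB store; adjust to the real on-disk size.
dbms.pagecache.memory=15g
```

Restart the server after changing it. To the best of my knowledge, dbms.pagecache.memory replaced the per-store neostore.*.mapped_memory settings from 2.2 onward, so the mapped_memory lines in the question may have no effect on 2.3; treat that as something to verify against the 2.3 manual rather than as a certainty.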