0
votes

I'm running Neo4j 2.3 on a CentOS machine with 27 GB of RAM. The DB size is about 10 GB (for now), and I want Neo4j to work as fast as possible.

Can you please advise whether the following settings are OK? (I'm a newbie with Java.)

dbms.pagecache.memory=15g
wrapper.java.initmemory=8192
wrapper.java.maxmemory=8192

Also, these are the JVM heap defaults reported on this OS:

java -XX:+PrintFlagsFinal -version | grep -e '\(Initial\|Max\)HeapSize'

   uintx InitialHeapSize   := 24696061952 {product}

   uintx MaxHeapSize      := 24696061952 {product}

*** I'm not sure why htop shows Neo4j taking 24g of virtual memory.

1
Actually it is very easy to explain: 8 GiB of heap plus 15 GiB of page cache is 23 GiB, i.e. about 24.7 GB: 15*1024^3 bytes + 8192*1024^2 bytes = 24,696,061,952 bytes. - Michael Hunger

1 Answer

4
votes

I assume that you are using the latest stable Neo4j version (2.3.1).

Links that you might be interested in:

Getting the best performance from your Neo4j database can be divided into several parts:

  1. Machine configuration
  2. Server configuration
  3. Application configuration

TL;DR: your settings are OK.

Machine configuration

What you need to check is:

  • Good disks (SSD preferable). Slow disks can kill performance. There is a great post by Struct, "Neo4j Write Throughput on Linux ext4 Filesystems". See also "Linux filesystem tuning" in the official Neo4j docs.
  • Maximum open files (doc).
  • Network. Your application and the server should be on the same local network. The driver uses the HTTP REST API to talk to the Neo4j database, so if they are not on the same local network, long request round trips can kill performance.
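
To make the open-files point concrete, here is a minimal sketch that reads the limit for the current user on Linux. The 40000 threshold is an assumption on my side (it is the figure I remember from the Neo4j docs); check the doc for your version. Run it as the same user that runs Neo4j, otherwise the number you see is not the one that matters.

```python
import resource

# Assumption: 40000 is the recommended minimum; verify it in the Neo4j docs
# for the version you run.
RECOMMENDED_MIN_OPEN_FILES = 40000


def is_limit_sufficient(soft_limit, minimum=RECOMMENDED_MIN_OPEN_FILES):
    """Return True if the soft open-files limit is unlimited or >= minimum."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= minimum


if __name__ == "__main__":
    # getrlimit returns a (soft, hard) tuple for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    if not is_limit_sufficient(soft):
        print("Open-files limit looks low for Neo4j; consider raising it.")
```

The same number is what `ulimit -n` prints in a shell for that user.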

Server configuration

Neo4j by default is smart enough to auto-determine settings, based on your machine.

Heap size configuration recommendations - doc. Basic rule there - there is no need to give more memory than is needed.

Pagecache - doc.

It caches the Neo4j data as stored on the durable media.

In Neo4j 2.3.0 the object cache is completely removed and the page cache is the only caching mechanism now. Basic rule there - if your database fits in RAM, set this property to the size of the database (maybe add 10-20% on top of it). This allows Neo4j to load the whole store into RAM.
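
As a rough sanity check of that rule against the numbers from the question, a small sketch (the function names are mine, purely illustrative, and the 20% headroom is just the upper end of the range above):

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def suggest_pagecache_gib(store_gib, headroom_pct=20):
    """Store size plus headroom, rounded up to whole GiB."""
    return math.ceil(store_gib * (100 + headroom_pct) / 100)


def remaining_for_os_gib(ram_gib, heap_gib, pagecache_gib):
    """RAM left for the OS and everything else after heap and page cache."""
    return ram_gib - heap_gib - pagecache_gib


def configured_total_bytes(pagecache_gib, heap_mib):
    """Heap plus page cache in bytes, as configured in the question."""
    return pagecache_gib * GIB + heap_mib * MIB


if __name__ == "__main__":
    print(suggest_pagecache_gib(10))         # 12 (10g store + 20%)
    print(remaining_for_os_gib(27, 8, 12))   # 7 GiB left with a 12g page cache
    print(remaining_for_os_gib(27, 8, 15))   # 4 GiB left with the 15g setting
    print(configured_total_bytes(15, 8192))  # 24696061952
```

So 15g is more than the rule asks for with a 10g store, but it leaves room for growth and still keeps a few GB free for the OS.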

Other Neo4j configuration settings

Application

So, in your application you should ensure several things:

  • Always execute Cypher with parameters. As of 2.3.0, Cypher is compiled down to bytecode on first execution. If your query is parameterized, the database can reuse the already compiled version. Raw (first) Cypher query execution performance can be quite bad.

  • Cache sharding. If your database is large and does not fit in memory AND you are using a clustered deployment, you can benefit from cache sharding. Basically, what you need to do is route requests for the same data to the same server. This allows Neo4j to keep the most-used data in cache, and you end up with a situation where each server has its own piece of the data in cache.
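
To illustrate the parameters point, a minimal sketch that only builds request bodies (it does not talk to a server). The `Person` label and `name` property are hypothetical, and the body shape is the one I believe the transactional HTTP endpoint (`/db/data/transaction/commit`) expects; verify it against the docs for your version. Note the `{name}` placeholder syntax used by Neo4j 2.x.

```python
import json

# Hypothetical query: the Person label and name property are illustrative only.
FIND_PERSON = "MATCH (p:Person {name: {name}}) RETURN p"


def build_payload(statement, parameters):
    """Build a JSON body with one statement and its parameters."""
    return json.dumps(
        {"statements": [{"statement": statement, "parameters": parameters}]},
        sort_keys=True,
    )


def inline_query(name):
    """Anti-pattern: the value is baked into the query text itself."""
    return "MATCH (p:Person {name: '%s'}) RETURN p" % name


if __name__ == "__main__":
    names = ["Alice", "Bob"]
    for name in names:
        # The statement text is identical on every call; only parameters vary.
        print(build_payload(FIND_PERSON, {"name": name}))
    # Inlined values produce a different query text for every value.
    print(len({inline_query(n) for n in names}))  # 2
```

With the parameterized form the database sees one query text for all values, which is what lets it reuse the already prepared query. The inlined form also opens the door to injection, so it is worth avoiding for that reason alone.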
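
The cache sharding idea can be sketched as a tiny router in front of the cluster. This is only an illustration under my own assumptions: the server URLs are made up, and a simple hash-modulo scheme is used, not anything Neo4j provides out of the box.

```python
import hashlib

# Hypothetical cluster members; replace with your own instance URLs.
SERVERS = ["http://neo4j-1:7474", "http://neo4j-2:7474", "http://neo4j-3:7474"]


def pick_server(routing_key, servers=SERVERS):
    """Map a routing key (e.g. a user id) to one server, deterministically."""
    # md5 is used only as a stable hash; Python's built-in hash() for strings
    # changes between processes, which would break the routing.
    digest = hashlib.md5(routing_key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


if __name__ == "__main__":
    for key in ("user-1", "user-2", "user-1"):
        # The same key always goes to the same server.
        print(key, "->", pick_server(key))
```

Keep in mind that plain modulo routing reshuffles most keys when a server is added or removed; consistent hashing would soften that, at the cost of a bit more code.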

Other

Other settings that I found interesting: