Fastest way to get all Neo4j nodes and relationships?

Question

What is the fastest way to get all unordered nodes and relationships from a running Neo4j 2.x server into a program?

Cypher MATCH n RETURN n is too slow for my use case (say we have >10M nodes to extract).

The shell command dump seems interesting, but calling it from source code requires a hack. Are there any benchmarks available for dump?

Any advice appreciated!

--EDIT--

I execute the query through the REST endpoint of a local Neo4j server (thus no network effect) with a query like MATCH n RETURN n SKIP 0 LIMIT 50000. My db is Neo4j 2.0.3 populated with 100k nodes, each with 1 integer property, and no relationships. Computer: SSD with read speed 1.3+ MB/s and CPU i7 1.6 GHz, JVM -Xmx2g. It takes ~4 s to retrieve 50k nodes:

curl -s -w %{time_total} -d"query=match n return n limit 50000" -D- -onul: http://localhost:7474/db/data/cypher

HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *
Content-Length: 63394503
Server: Jetty(9.0.z-SNAPSHOT)

4,047

How do you execute match (n) return n? The tx endpoint should be fast enough; it is mostly limited by the disk speed of loading the properties, and probably by the network. If you only need the structure, you can use match (n) return id(n) as ID - Michael Hunger
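For illustration, here is a rough Python sketch of the ID-only query against the transactional endpoint mentioned in the comment. It assumes the Neo4j 2.x transactional API at /db/data/transaction/commit and its {"statements": [...]} request body; the helper names id_query_payload and extract_ids are made up for this example, and the HTTP call itself is left out.

```python
import json

# Path of the transactional Cypher endpoint in Neo4j 2.x, relative to the
# server root (assumption: the server uses the default REST layout).
TX_COMMIT_PATH = "/db/data/transaction/commit"


def id_query_payload() -> bytes:
    """Build the request body that asks only for node ids (hypothetical helper)."""
    statements = [{"statement": "MATCH (n) RETURN id(n) AS ID"}]
    return json.dumps({"statements": statements}).encode("utf-8")


def extract_ids(response: dict) -> list:
    """Pull the ids out of a decoded transactional-endpoint response."""
    if response.get("errors"):
        raise RuntimeError(response["errors"])
    # Each entry in "data" carries one result row under the "row" key.
    return [entry["row"][0] for entry in response["results"][0]["data"]]
```

Returning only id(n) avoids loading and serializing the node properties, which is where the comment expects most of the time to go.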

david_p · Accepted Answer · 2015-02-10T09:16:11

What you want is to enable HTTP chunked encoding (aka streaming) so that Neo4j can start sending you results without holding them all in memory. You do this by adding the Accept: application/json;stream=true HTTP request header.

This request does the trick:

curl -i -o streamed.txt -XPOST \
  -d'{ "query":"MATCH n RETURN n" }' \
  -H 'accept:application/json;stream=true' \
  -H 'content-type:application/json' \
  'http://localhost:7474/db/data/cypher'
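
If you would rather issue the same request from a program than from curl, a minimal Python sketch using only the standard library could look like this. The URL and headers are taken from the curl call; the function names build_request and stream_chunks and the 64 KiB chunk size are arbitrary choices for the example.

```python
import json
import urllib.request

CYPHER_URL = "http://localhost:7474/db/data/cypher"  # same endpoint as the curl call


def build_request(query, url=CYPHER_URL):
    """Build a POST request that asks Neo4j for a streamed JSON response."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json;stream=true",
            "Content-Type": "application/json",
        },
    )


def stream_chunks(request, chunk_size=64 * 1024):
    """Yield the response body piece by piece instead of reading it all at once."""
    with urllib.request.urlopen(request) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Something like `for chunk in stream_chunks(build_request("MATCH n RETURN n")): ...` then lets you write each chunk to disk or feed it to a parser as it arrives.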

On a side note, if you want to start parsing the response on your side before having received the whole content (to avoid filling up your memory / hard drive), you may want to look into JSON stream parsing.
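
To give an idea of what JSON stream parsing can look like, here is a small standard-library sketch in Python. It assumes the response has the shape {"columns": [...], "data": [[...], [...]]}, that the first "data" key in the body is the top-level one, and that every row is a JSON array; iter_rows is a name invented for this example. A dedicated streaming JSON library would be more robust and faster, since this version retries an incomplete row from its start on every new chunk.

```python
import codecs
import json
from typing import Iterable, Iterator


def iter_rows(chunks: Iterable[bytes]) -> Iterator[list]:
    """Yield each row of the "data" array as soon as it has fully arrived."""
    utf8 = codecs.getincrementaldecoder("utf-8")()  # copes with split multi-byte chars
    decoder = json.JSONDecoder()
    buf = ""
    in_data = False
    for chunk in chunks:
        buf += utf8.decode(chunk)
        if not in_data:
            # Skip everything up to the opening bracket of the "data" array.
            key = buf.find('"data"')
            if key == -1:
                continue
            start = buf.find("[", key + len('"data"'))
            if start == -1:
                continue
            buf = buf[start + 1:]
            in_data = True
        while True:
            buf = buf.lstrip(" \t\r\n,")
            if not buf:
                break  # nothing buffered, wait for the next chunk
            if buf[0] == "]":
                return  # end of the "data" array
            try:
                row, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # row is still incomplete, wait for the next chunk
            yield row
            buf = buf[end:]
```

Only the row currently being received is kept in memory, so the full result never has to fit in RAM or on disk.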
