1
votes

I'm having some difficulty loading a Cypher file into Neo4j on Windows 10. The file in question is a 175 MB .cql file containing more than a million lines of node and edge statements (separated by semicolons) in the Cypher language -- CREATE [node], that sort of thing. For smaller files, I have been using an APOC command in the web browser:

call apoc.cypher.runFile('file:///<file path>')

but this is too slow for a file with more than a million queries. I've created indexes for the nodes, and am currently running the file through this command:

neo4j-shell -file <file path> -path localhost

but this is still slow. I was wondering, is there any way to speed up the intake?

Also, note that I am using a recent ONGDB build rather than straight Neo4j; I do not believe this will make any substantial difference.

3 Answers

3
votes

If the purpose of your very large CQL file is simply to ingest data, then doing it purely in Cypher is going to be very slow (and may even cause an out-of-memory error).

If you are ingesting into a new Neo4j DB, you should consider refactoring the data out of the CQL file and using the import command of the neo4j-admin tool to ingest the data efficiently.

If you are ingesting into an existing DB, you should consider refactoring the data and logic out of the CQL file and using LOAD CSV.
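As a rough illustration of what "refactoring the data out" could look like, here is a minimal Python sketch that pulls node data out of CREATE lines into CSV text. It assumes every node line has the simple shape CREATE (n:LABEL { name: "..." }) with a single name property, which may well not match your file; the names NODE_PATTERN, extract_nodes and nodes_to_csv are made up for this example.

```python
import csv
import io
import re
from typing import Iterable, Iterator, Tuple

# Assumption: matches only the simple single-property shape
#   CREATE (n:PERSON { name: "Homer Simpson" })
# Real files with more properties or escaped quotes need a real parser.
NODE_PATTERN = re.compile(
    r'CREATE\s*\(\s*\w*\s*:\s*(\w+)\s*\{\s*name\s*:\s*"([^"]*)"\s*\}\s*\)'
)


def extract_nodes(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (label, name) for each line that matches NODE_PATTERN."""
    for line in lines:
        match = NODE_PATTERN.search(line)
        if match:
            yield match.group(1), match.group(2)


def nodes_to_csv(lines: Iterable[str]) -> str:
    """Return CSV text with a header row, suitable for saving to a file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "name"])
    writer.writerows(extract_nodes(lines))
    return buffer.getvalue()
```

The CSV text can be written to a file in the database's import directory (say people.csv, a hypothetical name) and loaded with a single query along the lines of:

LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row CREATE (:PERSON {name: row.name})

The label is written into the query here rather than read from the row, so in practice you would probably want one CSV file (or one query) per label.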

2
votes

I ended up ingesting it using cypher-shell. It's still slow, but at least it does finish. Using it requires one to first open a Neo4j console and then, in a second command line, run:

type <filepath>\data.cql | bin\cypher-shell.bat -a localhost -u <user> -p <password> --fail-at-end

This works for Windows 10, although it does take a while.

1
votes

When running a query outside of a transaction, Neo4j will automatically start and commit a separate transaction for every query. You can speed things up by starting a transaction at the beginning, then committing it and starting a new transaction every few thousand queries (memory use goes up with transaction size, so that's the limiting factor on how large the transactions can be).

Example queries.cypher (with transactions of size 3):

:begin
CREATE (n:PERSON { name: "Homer Simpson" });
CREATE (n:PERSON { name: "Marge Simpson" });
CREATE (n:PERSON { name: "Abe Simpson" });
:commit
:begin
CREATE (n:PERSON { name: "Bart Simpson" });
CREATE (n:PERSON { name: "Lisa Simpson" });
CREATE (n:PERSON { name: "Maggie Simpson" });
:commit

And then run cypher-shell < queries.cypher as usual.
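Adding :begin and :commit by hand isn't realistic for a million-line file, so the wrapping can be scripted. Here is a minimal Python sketch, assuming the file has exactly one complete statement per line (no statement spanning several lines); the function names and the default batch size of 5000 are just illustrative choices.

```python
from typing import Iterable, Iterator


def batch_statements(lines: Iterable[str], batch_size: int = 5000) -> Iterator[str]:
    """Yield the input statements wrapped in :begin / :commit blocks.

    Assumption: each non-blank input line is one complete Cypher statement.
    """
    count = 0
    for line in lines:
        statement = line.strip()
        if not statement:
            continue  # skip blank lines
        if count == 0:
            yield ":begin"  # open a new transaction before the first statement
        yield statement
        count += 1
        if count == batch_size:
            yield ":commit"  # batch is full, commit it
            count = 0
    if count:
        yield ":commit"  # commit the final, partially filled batch


def write_batched(src_path: str, dst_path: str, batch_size: int = 5000) -> None:
    """Read statements from src_path and write the batched version to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for out_line in batch_statements(src, batch_size):
            dst.write(out_line + "\n")
```

Calling write_batched("data.cql", "queries.cypher") would produce a file in the format shown above. The input is streamed line by line, so the 175 MB file never has to fit in memory at once.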