
I am importing data with the new Neo4j version (2.1.1), which allows for CSV import. The import in question deals with bigrams.

The CSV file looks like this:

$ head ~/filepath/w2.csv 
value,w1,w2,
275,a,a
31,a,aaa
29,a,all
45,a,an

I am pasting this into the neo4j-shell client to load the CSV:

neo4j-sh (?)$ USING PERIODIC COMMIT 
> LOAD CSV WITH HEADERS FROM "file:/Users/code/Downloads/w2.csv" AS line 
> MERGE (w1:Word {value: line.w1}) 
> MERGE (w2:Word {value: line.w2}) 
> MERGE (w1)-[:LINK {value: line.value}]->(w2);

The problem is that the shell now hangs and I have no idea what it is doing. I have checked in the interactive web environment and no data seems to have been loaded. It seems unlikely that I simply have not reached a periodic commit point yet, as the shell has been running for half an hour now.

Is there any way for me to get a sign of life from the CSV loader? I would like to see some intermediate results to help me debug what is going on. A solution to my current situation is welcome, but I am specifically interested in a way to debug the CSV loader.
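
One low-tech sanity check (a sketch, not from the question itself, reusing the file path above): run the statement without PERIODIC COMMIT on a capped number of rows and return a count, so the shell has to print something if the file is being read at all.

LOAD CSV WITH HEADERS FROM "file:/Users/code/Downloads/w2.csv" AS line
WITH line LIMIT 100
MERGE (w1:Word {value: line.w1})
MERGE (w2:Word {value: line.w2})
MERGE (w1)-[:LINK {value: line.value}]->(w2)
RETURN count(*);

If that comes back within a few seconds, the file and the query are fine and the hang is likely related to the size of the full import.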

Perhaps try it with a smaller subset first? How long is your CSV file? – Michael Hunger

It seems to work for sets that are smaller than 10,000 rows. We have also tried using an index but it still seems to hang somewhere. An alternative would be to split the CSV file up and run the import commands separately, but that is what USING PERIODIC COMMIT should do. – cantdutchthis

1 Answer


I don't have an answer for the debugging/logging part.

But here is a hint that might make your query a bit faster.

Do you have an index on :Word(value)?

You can try adding an index:

CREATE INDEX ON :Word(value);
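
As a sketch of how this fits together with the import from the question (the ON CREATE SET variant is a suggestion added here, not part of the original answer; it lets the relationship MERGE match on the relationship type alone instead of also comparing the value property):

// run as two separate statements in neo4j-shell: create the index first,
// then start the import so the MERGE lookups can use it
CREATE INDEX ON :Word(value);

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/Users/code/Downloads/w2.csv" AS line
MERGE (w1:Word {value: line.w1})
MERGE (w2:Word {value: line.w2})
MERGE (w1)-[r:LINK]->(w2)
ON CREATE SET r.value = line.value;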

UPDATE: If you want to follow the import's progress, you can watch the disk size of the graph.db directory. It gives you a rough idea of how far along the import is.

On a unix machine:

du -s ~/neo4j-community-2.1.2/data/graph.db/
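
To turn that into a rough progress readout, you could re-run it in a loop (a sketch; the install path is the one from the answer, adjust it to your own graph.db location):

# print the size every ten seconds until interrupted with Ctrl-C
while true; do du -s ~/neo4j-community-2.1.2/data/graph.db/; sleep 10; done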