I'm new to Cypher and trying to load a CSV of a tree structure with 5 columns. Within a single row, every item is a node, and each node in column n+1 is a child of the node in column n.

Example:

CSV columns: Level1, Level2, Level3, Level4, Level5

Structure: Level1_thing <--child_of-- Level2_thing <--child_of-- Level3_thing etc...

The data is not normalized, so node names repeat many times at every level except the lowest ones. What's the best way to load this CSV with Cypher and build the tree structure from it?

Apologies if this question is poorly formatted or asked; I'm new to both Stack Overflow and graph DBs.

3 Answers

0
votes

IIUC, you can use the LOAD CSV clause in Cypher to load both nodes and relationships. In your case, MERGE takes care of the duplicates. Your example should work along these lines, with a bit of pseudo-code:

LOAD CSV WITH HEADERS FROM "your_path" AS row
MERGE (l1:Label {prop: row.Level1})
...
MERGE (l5:Label {prop: row.Level5})
MERGE (l1)<-[:CHILD_OF]-(l2)<-...-(l5)

Basically, you can create nodes and relationships on the fly while reading the .csv file with headers. Hope that helps.

0
votes

What you are looking for is the MERGE clause.

For optimal execution, split the script into two phases:

1) Create nodes if they don't already exist

USING PERIODIC COMMIT 
LOAD CSV WITH HEADERS FROM "file:///my_file.csv" AS row
MERGE (l5:Node {value:row.Level5})
MERGE (l4:Node {value:row.Level4})
MERGE (l3:Node {value:row.Level3})
MERGE (l2:Node {value:row.Level2})
MERGE (l1:Node {value:row.Level1})

2) Create relationships if they don't already exist

USING PERIODIC COMMIT 
LOAD CSV WITH HEADERS FROM "file:///my_file.csv" AS row
MATCH (l5:Node {value:row.Level5})
MATCH (l4:Node {value:row.Level4})
MATCH (l3:Node {value:row.Level3})
MATCH (l2:Node {value:row.Level2})
MATCH (l1:Node {value:row.Level1})
MERGE (l5)-[:child_of]->(l4)
MERGE (l4)-[:child_of]->(l3)
MERGE (l3)-[:child_of]->(l2)
MERGE (l2)-[:child_of]->(l1)
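
To see why the two passes collapse the repeated names, here is a minimal Python sketch that simulates them outside the database. It is only an illustration, not part of the Cypher script: the sample rows and the `load_tree` helper are hypothetical, and sets stand in for MERGE's "create only if missing" behaviour.

```python
import csv
import io

# Hypothetical sample data; the column names follow the question's header row.
SAMPLE_ROWS = """Level1,Level2,Level3,Level4,Level5
root,a,a1,a1x,leaf1
root,a,a1,a1x,leaf2
root,b,b1,b1x,leaf3
"""

LEVELS = ["Level1", "Level2", "Level3", "Level4", "Level5"]


def load_tree(text):
    """Return (nodes, edges), where each edge is a (child, parent) pair."""
    # Phase 1: one node per distinct value, like MERGE on (:Node {value: ...}).
    nodes = set()
    for row in csv.DictReader(io.StringIO(text)):
        nodes.update(row[level] for level in LEVELS)

    # Phase 2: one child_of edge per distinct (child, parent) pair.
    edges = set()
    for row in csv.DictReader(io.StringIO(text)):
        for parent, child in zip(LEVELS, LEVELS[1:]):
            edges.add((row[child], row[parent]))
    return nodes, edges


nodes, edges = load_tree(SAMPLE_ROWS)
print(len(nodes), "nodes,", len(edges), "edges")  # prints "10 nodes, 9 edges"
```

Note that nodes are keyed by value alone, exactly as in the MERGE statements above, so the same name appearing at two different levels would end up as a single node.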

Before running either phase, create a uniqueness constraint on the node property so that MERGE can look nodes up quickly. In my example that is:

CREATE CONSTRAINT ON (n:Node) ASSERT n.value IS UNIQUE;
0
votes

If the CSV file does not have a header line and the column order is fixed, you can solve the problem like this:

LOAD CSV FROM "file:///path/to/tree.csv" AS row
// WITH row SKIP 1 // If there is a header line, you can skip the first row
// Iterate over each pair of adjacent columns:
UNWIND RANGE(0, size(row)-2) AS i
  MERGE (P:Node {id: row[i]})
  MERGE (C:Node {id: row[i+1]})
  MERGE (P)<-[:child_of]-(C)
RETURN *
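
The index arithmetic in the UNWIND is easy to get wrong, so here is a small Python sketch of the same adjacent-column walk. It is an illustration only: the sample rows and the `adjacent_edges` helper are hypothetical, and a set stands in for MERGE on the relationship.

```python
import csv
import io

# Hypothetical header-less sample; any number of columns works.
SAMPLE_PAIRS = "root,a,a1\nroot,b,b1\n"


def adjacent_edges(text):
    """Collect distinct (child, parent) pairs from neighbouring columns."""
    edges = set()
    for row in csv.reader(io.StringIO(text)):
        # Cypher's range(0, size(row)-2) is inclusive at both ends, so it
        # yields the same indices as Python's range(len(row) - 1).
        for i in range(len(row) - 1):
            edges.add((row[i + 1], row[i]))
    return edges


pair_edges = adjacent_edges(SAMPLE_PAIRS)
print(sorted(pair_edges))
```

Because the loop depends only on the row length, the same approach works unchanged if the file has more or fewer than five levels.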

And yes, it's really worth adding an index before running the load:

CREATE INDEX ON :Node(id)