I have a file with relationships in this format:

!comment
!comment
nodeID   nodeName   edgeType   nodeID
nodeID   nodeName   edgeType   nodeID
nodeID   nodeName   edgeType   nodeID

I want to import the nodes and edges from that file into my Neo4j database.
I tried the following steps:

  1. Create a unique constraint on node ids
  2. Read file, skip comment rows, create unique nodes from each row (skip row if node already exists)
  3. Read file, skip comment rows, create edges from each row

// Each node id is unique
CREATE CONSTRAINT ON (n:Node) ASSERT n.id IS UNIQUE

// For each row not starting with "!", create node if it doesn't exist
LOAD CSV FROM "file:///relationships.tsv" AS row
FIELDTERMINATOR '\t'
WITH row
WHERE NOT row =~ '^!.*'
CREATE (:Node {id: row[0], name: row[1]})

// For each row not starting with "!", create edge
LOAD CSV FROM "file:///relationships.tsv" AS row
FIELDTERMINATOR '\t'
WITH row
WHERE NOT row =~ '^!.*'
MATCH (n:Node), (m:Node)
WHERE n.id = row[0] AND m.id = row[3]
WITH n, m, row
CASE row[2]
  WHEN 'F' THEN
    CREATE UNIQUE (m)-[:Edge {type: 'friend'}]->(n)
  WHEN 'P' THEN
    CREATE UNIQUE (m)-[:Edge {type: 'partner'}]->(n)
END

The code above doesn't work. Being new to Cypher, I'm not sure what I am doing wrong. Ultimately, I would like to merge steps 2 and 3 so that the file is read only once. How can I import this data efficiently?

Can you explain what is not working? Do you have any result after your import? - logisima
The regex isn't working because row is a list of strings. - D.ldo

1 Answer

[UPDATED twice]

This version of your 3rd query should work:

LOAD CSV FROM "file:///relationships.tsv" AS row
FIELDTERMINATOR '\t'
WITH row
WHERE NOT row[0] STARTS WITH '!'
MATCH (m:Node)
WHERE m.id = row[3]
MERGE (n:Node {id: row[0]})
SET n.name = row[1]
FOREACH (domain IN
    CASE
        WHEN row[2] = 'F' THEN ['friend']
        WHEN row[2] = 'P' THEN ['partner']
        ELSE []
    END |
    MERGE (m)-[:Edge {type: domain}]->(n)
);

This version tests the first item in the row for a leading '!' rather than the entire row, which is a list and not a string. It uses a FOREACH clause to perform the conditional updates, since a Cypher CASE expression only returns values and cannot contain updating clauses. It also uses MERGE instead of the deprecated CREATE UNIQUE for the relationships, and MERGE instead of CREATE for the n nodes, so that you don't produce duplicates (say, if you re-run the same query).
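
Since the question also asks about reading the file only once, here is a minimal sketch of a single-pass variant. It assumes the unique constraint on :Node(id) is already in place. It also replaces the MATCH on m with a MERGE, so that a row is not skipped when its target node has not been created yet; that substitution is an assumption about what you want, not part of the query above. The loop variable name edgeType is arbitrary.

```cypher
// Sketch: create both nodes and the edge in one pass over the file.
// Assumes the uniqueness constraint on :Node(id) already exists.
LOAD CSV FROM "file:///relationships.tsv" AS row
FIELDTERMINATOR '\t'
WITH row
// Skip comment rows: test the first field, not the whole list
WHERE NOT row[0] STARTS WITH '!'
// Source node: create if missing, then (re)set its name
MERGE (n:Node {id: row[0]})
SET n.name = row[1]
// Target node: create if missing (the row carries no name for it)
MERGE (m:Node {id: row[3]})
// Conditional update: the list is empty for unknown edge codes,
// so the inner MERGE runs zero times or once per row
FOREACH (edgeType IN
    CASE row[2]
        WHEN 'F' THEN ['friend']
        WHEN 'P' THEN ['partner']
        ELSE []
    END |
    MERGE (m)-[:Edge {type: edgeType}]->(n)
);
```

With this variant, a node that only ever appears in the fourth column ends up with an id but no name, because the file supplies names only for the node in the first column.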