I'm trying to load some data into neo4j from csv files, and it seems a unique constraint error is triggered when it shouldn't be. In particular, I created a constraint using

CREATE CONSTRAINT ON (node:`researcher`) ASSERT node.`id_patstats` IS UNIQUE;

Then, after inserting some data in neo4j, if I run (in neo4j browser)

MATCH (n:researcher {id_patstats: "2789"})
RETURN n

I get no results (no changes, no records), but if I run

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///home/manu/proyectos/PTL_RDIgraphs/rdigraphs/datamanager/tmp_patents/person906.csv' AS line
MERGE (n:researcher {`name` : line.`person_name`})
SET n.`id_patstats` = line.`person_id`;

I get

Neo.ClientError.Schema.ConstraintValidationFailed: Node(324016) already exists with label researcher and property id_patstats = '2789'

and the content of file person906.csv is

manu@cochi tmp_patents $cat person906.csv
person_id,person_name,doc_std_name,doc_std_name_id
2789,"li, jian",LI JIAN,2390

(this is a minimal non-working example extracted from a larger dataset; also, in the original "person906.csv" I made sure that "id_patstats" is really unique).

Any clue?

EDIT:

Still struggling with this...

If I run

MATCH (n) 
WHERE EXISTS(n.id_patstats) 
RETURN DISTINCT "node" as entity, n.id_patstats AS id_patstats 
LIMIT 25 
UNION ALL 
MATCH ()-[r]-() 
WHERE EXISTS(r.id_patstats) 
RETURN DISTINCT "relationship" AS entity, r.id_patstats AS id_patstats 
LIMIT 25

(clicking in the neo4j browser to get some examples of the id_patstats property) I get

(no changes, no records)

that is, the id_patstats property is not set anywhere. Moreover,

MATCH (n:researcher {`name` : "li, jian"})
SET n.`id_patstats` = XXX;

this always triggers an error regardless of XXX, which (I guess) means the actual problem is that the name "li, jian" is already present. Although I didn't set any constraint on the name property, I'm guessing neo4j reasons like this: you are trying to set a UNIQUE property on nodes matched by a property (name) that is not necessarily UNIQUE; that match could yield several nodes, and I can't set the same UNIQUE property value on all of them... so I won't even try.

Your realization at the end is correct. The unique constraint is violated because your MATCH on 'li, jian' matches multiple nodes, and setting the unique property of all those nodes to the same value violates the constraint. - InverseFalcon

1 Answer


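At least two of your researchers have the same name, so a MERGE on name matches several nodes and the SET then tries to give all of them the same id_patstats. You shouldn't MERGE by name and then add the id as a property. MERGE by id and set the name as a property instead, and it will work fine: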

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///home/manu/proyectos/PTL_RDIgraphs/rdigraphs/datamanager/tmp_patents/person906.csv' AS line
MERGE (n:researcher {`id_patstats`:line.`person_id`})
SET n.`name` = line.`person_name`;
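
To confirm that duplicate names really are the cause before re-running the import, a query along these lines should list the names shared by more than one node. This is only a sketch: it assumes the `researcher` label and `name` property from the question, and the aliases `name` and `node_count` are purely illustrative.

```cypher
// Group researcher nodes by name and keep only names carried by more than one node.
MATCH (n:researcher)
WITH n.name AS name, count(*) AS node_count
WHERE node_count > 1
RETURN name, node_count
ORDER BY node_count DESC
LIMIT 25;
```

If "li, jian" shows up in the result, the original MERGE by name matched every one of those nodes, and the SET tried to write the same id_patstats value to each of them.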