
I am making a Neo4j graph to show a network of music artists.

I have a CSV with a few columns. The first column is called Artist and holds the person who made the song. The second and third columns are called Feature1 and Feature2, respectively, and hold the featured artists on that song (see example: https://docs.google.com/spreadsheets/d/1TE8MtNy6XnR2_QE_0W8iwoWVifd6b7KXl20oCTVo5Ug/edit?usp=sharing).

I have used MERGE so that any given artist has just a single node. Artists are connected by a FEATURES relationship with a strength property that represents the number of times one artist has featured another. When the relationship is first created, strength is set to 1. For example, the first time (X)-[r:FEATURES]->(Y) occurs, r.strength = 1.

CREATE CONSTRAINT ON (a:artist) ASSERT a.artistName IS UNIQUE;
CREATE CONSTRAINT ON (f:feature) ASSERT f.artistName IS UNIQUE;
CREATE CONSTRAINT ON (f:feature1) ASSERT f.artistName IS UNIQUE;

USING PERIODIC COMMIT
    LOAD CSV WITH HEADERS from 'aws/artist-test.csv' as line
MERGE (artist:Artist {artistName: line.Artist})
MERGE (feature:Artist {artistName: line.Feature1})
MERGE (feature1:Artist {artistName: line.Feature2})
CREATE (artist)-[:FEATURES {strength:1}]->(feature)
CREATE (artist)-[:FEATURES {strength:1}]->(feature1)

Then I deleted the None node, which stands in for songs that have no features:

MATCH (artist:Artist {artistName:'None'})
OPTIONAL MATCH (artist)-[r]-() 
DELETE artist, r    

If X features Y on another song further down the CSV, the code currently creates another (duplicate) relationship with r.strength = 1. Rather than creating a new relationship, I'd like to have only the one (previously created) relationship and increase the value of r.strength by 1.

Any idea how I can do this? My current approach has been to create a bunch of duplicate relationships, then go back through, count the duplicates between each pair of artists, and set r.strength to that count. However, I haven't been able to get this to work, and before I spend more time on it, I figured there might be a more efficient way to accomplish this.

Any help is greatly appreciated. Thanks!

Check this answer. See if it helps. - Gandalf
By the way: none of your CREATE CONSTRAINT clauses do anything for you, since you do not use artist, feature, and feature1 as node labels. You should use this instead: CREATE CONSTRAINT ON (a:Artist) ASSERT a.artistName IS UNIQUE; Also, the property name artistName is a bit redundant for Artist nodes; you may want to use the simpler name instead. - cybersam
Okay, makes sense. Thank you. - Tim Holdsworth

1 Answer


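You can use MERGE on the relationship itself, together with ON CREATE SET and ON MATCH SET. The first time a pair of artists is seen, the relationship is created with strength 1; every later time it is matched, strength is incremented: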

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'aws/artist-test.csv' AS line
// One node per artist, whether they appear as the main or a featured artist
MERGE (artist:Artist {artistName: line.Artist})
MERGE (feature:Artist {artistName: line.Feature1})
MERGE (feature1:Artist {artistName: line.Feature2})

// First featured artist: create the relationship once, then count repeats
MERGE (artist)-[f1:FEATURES]->(feature)
ON CREATE SET f1.strength = 1
ON MATCH SET f1.strength = f1.strength + 1

// Second featured artist: same pattern
MERGE (artist)-[f2:FEATURES]->(feature1)
ON CREATE SET f2.strength = 1
ON MATCH SET f2.strength = f2.strength + 1
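
If the graph has already been loaded with duplicate relationships and you would rather not re-import, the count-the-duplicates idea from the question could be done as a one-off cleanup. This is only a sketch, and it assumes every existing FEATURES relationship was created with strength = 1 (as in the question's import), so the number of parallel relationships between two artists equals the strength you want. The names rels, keep, extras and total are just illustrative variables.

```cypher
// Group parallel FEATURES relationships by (source, target) pair
MATCH (a:Artist)-[r:FEATURES]->(b:Artist)
WITH a, b, collect(r) AS rels
WHERE size(rels) > 1
// Keep the first relationship, remember the others and the total count
WITH rels[0] AS keep, rels[1..] AS extras, size(rels) AS total
// Assumption: each duplicate had strength 1, so total is the new strength
SET keep.strength = total
// Remove the now-redundant duplicates
FOREACH (r IN extras | DELETE r)
```

Pairs with a single relationship are left untouched, since their strength is already 1. It is probably worth trying this on a copy of the data first, because the deletes cannot be undone once committed.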