I am making a Neo4j graph to show a network of music artists.
I have a CSV with a few columns. The first column is called Artist and is the person who made the song. The second and third columns are called Feature1 and Feature2, respectively, and represent the featured artists on a song (see example https://docs.google.com/spreadsheets/d/1TE8MtNy6XnR2_QE_0W8iwoWVifd6b7KXl20oCTVo5Ug/edit?usp=sharing).
I have merged so that any given artist has just a single node. Artists are connected by a FEATURED relationship with a strength property that represents the number of times someone has been featured. When the relationship is initialized, the relationship property strength is set to 1. For example, when (X)-[r:FEATURED]->(Y) occurs the first time, r.strength = 1.
CREATE CONSTRAINT ON (a:artist) ASSERT a.artistName IS UNIQUE;
CREATE CONSTRAINT ON (f:feature) ASSERT f.artistName IS UNIQUE;
CREATE CONSTRAINT ON (f:feature1) ASSERT f.artistName IS UNIQUE;
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS from 'aws/artist-test.csv' as line
MERGE (artist:Artist {artistName: line.Artist})
MERGE (feature:Artist {artistName: line.Feature1})
MERGE (feature1:Artist {artistName: line.Feature2})
CREATE (artist)-[:FEATURES {strength:1}]->(feature)
CREATE (artist)-[:FEATURES {strength:1}]->(feature1)
Then I deleted the None node, which stands in for songs that have no features:
MATCH (artist:Artist {artistName:'None'})
OPTIONAL MATCH (artist)-[r]-()
DELETE artist, r
If X features Y on another song further down the CSV, the code currently creates another (duplicate) relationship with r.strength = 1. Rather than creating a new relationship, I'd like to have only the one (previously created) relationship and increase the value of r.strength by 1.
Any idea how I can do this? My current approach has been to just create a bunch of duplicate relationships, then go back through and count all duplicate relationships, and set r.strength = #duplicate relationships. However, I haven't been able to get this to work, and before I waste more time on this, I figured there is a more efficient way to accomplish this.
Any help is greatly appreciated. Thanks!
None of your CREATE CONSTRAINT clauses do anything for you, since you do not use artist, feature, and feature1 as node labels. You should use this instead: CREATE CONSTRAINT ON (a:Artist) ASSERT a.artistName IS UNIQUE; Also, the property name artistName is a bit redundant for Artist nodes; you may want to simply use name instead. – cybersam
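One way to get a single relationship whose strength counts repeated features is to MERGE the relationship instead of CREATE-ing it, and to set or increment strength with ON CREATE SET and ON MATCH SET. The following is a minimal sketch rather than a tested import: it reuses the file path, column names, property name, and the FEATURES relationship type from the question's code (the prose calls it FEATURED, so adjust to whichever you intend), keeps the older CREATE CONSTRAINT ... ASSERT and USING PERIODIC COMMIT syntax used there, and assumes every Feature1 and Feature2 cell holds a value (an artist name or 'None'). The variable names r1 and r2 are only for illustration.

```cypher
// Single constraint on the label the import actually uses (as suggested in the comment above)
CREATE CONSTRAINT ON (a:Artist) ASSERT a.artistName IS UNIQUE;

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'aws/artist-test.csv' AS line
// One node per artist name, whether it appears as main or featured artist
MERGE (artist:Artist {artistName: line.Artist})
MERGE (feature:Artist {artistName: line.Feature1})
MERGE (feature1:Artist {artistName: line.Feature2})
// MERGE on the relationship pattern reuses an existing FEATURES relationship
// between the two nodes, or creates one if none exists yet
MERGE (artist)-[r1:FEATURES]->(feature)
  ON CREATE SET r1.strength = 1                 // first time X features Y
  ON MATCH SET r1.strength = r1.strength + 1    // every later occurrence
MERGE (artist)-[r2:FEATURES]->(feature1)
  ON CREATE SET r2.strength = 1
  ON MATCH SET r2.strength = r2.strength + 1;
```

Because strength is deliberately left out of the MERGE pattern, the match is made on the two end nodes and the relationship type only; putting {strength: 1} inside the pattern would make later rows fail to match once the value has been incremented. The cleanup query from the question that deletes the 'None' node and its relationships should still work unchanged after this import.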