
I created a simple CSV file of boxing matches, and I'm trying to figure out how to model it in Neo4j.

The csv looks like this:

enter image description here

I wanted to practice on this small dataset because Neo4j seems like a good way to easily query who fought whom, who had common opponents, and so on.

My first thought was that naturally, each boxer should be represented in a 'boxer' node, and each fight should be represented in a 'fight' node.

After modeling it that way, I realized that there isn't actually one node per boxer, because a boxer's age changes over time. So each boxer would need a separate node for each fight. For example, Glass Joe has 2 fights and thus appears twice: once when he was 23, and once the next year, at 24, when he fought Sandman:

enter image description here

But this kinda defeats the purpose. Now, my graph will be made up of disconnected sets of 3 nodes, one for each fight in the csv. So what's the purpose?

My question is, how can I model such a simple yet complex situation like this: some type of tournament or game that changes over time, and the properties of the competitors' nodes change -- yet we want the graph to be connected:

enter image description here

(oops: Sandman should now be 51)

But then again, I don't think the above image is correct -- the values shown on the edges are actually properties of the boxer node. If they are properties of the boxer... then they don't belong on the edge, right?

Here is my code so far (and the csv lives here):

LOAD CSV WITH HEADERS FROM
'file:///<grab it from dropbox please!>' AS line
CREATE (b:boxer {boxer_id: line.boxer_id, name: line.name})
SET b.age = toInteger(line.age);


LOAD CSV WITH HEADERS FROM
'file:///<grab it from dropbox please!>' AS line
MERGE(f:fight  {fight_id: line.fight_id});

I end up with these nodes:

enter image description here enter image description here

...but not sure how to connect them. Any advice or recommendations would be greatly appreciated.


1 Answer


Your first instinct was right. Ideally, if you had each boxer's date of birth, that's what you would store. It would also help you tell apart boxers who have the same name/nickname. That said, your idea of storing the boxer's age as part of the relationship is a good one.
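As a rough sketch of that simpler model (one node per boxer, with the age kept on the relationship), the import could look something like the following. The file name is hypothetical, and the column names (boxer_id, name, age, fight_id) are assumed from the code in the question:

```cypher
// Assumption: 'boxing.csv' is a placeholder file name, one row per boxer per fight.
LOAD CSV WITH HEADERS FROM 'file:///boxing.csv' AS line
// One node per boxer, however many rows mention them.
MERGE (b:Boxer {boxer_id: line.boxer_id})
  ON CREATE SET b.name = line.name
// One node per fight.
MERGE (f:Fight {fight_id: line.fight_id})
// The age at the time of the fight lives on the relationship.
MERGE (b)-[r:FOUGHT_IN]->(f)
  ON CREATE SET r.age = toInteger(line.age);

// Everyone Glass Joe has fought:
MATCH (:Boxer {name: 'Glass Joe'})-[:FOUGHT_IN]->(:Fight)<-[:FOUGHT_IN]-(opponent:Boxer)
RETURN opponent.name;
```

Because each boxer is MERGEd on boxer_id, the graph stays connected through the shared Boxer nodes rather than splitting into one small cluster per fight.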

If you really wanted to store a separate node for each boxer for each row, you could do the following:

(:BoxerRecord)-[:FOUGHT_IN]->(:Fight)
(:BoxerRecord)-[:REPRESENTS]->(:Boxer)

So basically you use the CREATE clause to create a new BoxerRecord for every row, and MERGE for each Boxer node so that rows for the same boxer get merged into one node.
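A minimal sketch of that import, assuming the same columns as the question's code (boxer_id, name, age, fight_id) and a hypothetical file name:

```cypher
// Assumption: 'boxing.csv' is a placeholder file name, one row per boxer per fight.
LOAD CSV WITH HEADERS FROM 'file:///boxing.csv' AS line
// MERGE: rows for the same boxer or the same fight share one node.
MERGE (b:Boxer {boxer_id: line.boxer_id})
  ON CREATE SET b.name = line.name
MERGE (f:Fight {fight_id: line.fight_id})
// CREATE: every row gets its own BoxerRecord holding the per-fight age.
CREATE (r:BoxerRecord {age: toInteger(line.age)})
CREATE (r)-[:FOUGHT_IN]->(f)
CREATE (r)-[:REPRESENTS]->(b);
```

Note that LOAD CSV reads every field as a string, so boxer_id and fight_id are stored as strings here unless you convert them with toInteger().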

Then if you wanted to find all of the boxers that two people have fought in common (the boxer_id values here are made up):

// LOAD CSV reads fields as strings: use '100' and '101' unless boxer_id was converted with toInteger().
MATCH
  (b1:Boxer {boxer_id: 100}),
  (b2:Boxer {boxer_id: 101}),
  (b1)<-[:REPRESENTS]-(:BoxerRecord)-[:FOUGHT_IN]->(:Fight)<-[:FOUGHT_IN]-(:BoxerRecord)-[:REPRESENTS]->(common_boxer:Boxer)<-[:REPRESENTS]-(:BoxerRecord)-[:FOUGHT_IN]->(:Fight)<-[:FOUGHT_IN]-(:BoxerRecord)-[:REPRESENTS]->(b2)
RETURN common_boxer, count(*)