1
votes

I'm curious what the best way to model enums is in Neo4j. Should they be nodes, relationships, properties, etc.?

enum Activity {
    BASKETBALL,  // ID: 1
    HOCKEY // ID: 2
}

For example, in SQL I could just make an enum table and have foreign keys (ID: 1, 2) pointing to that lookup table. Should I have a node for each entry (BASKETBALL, HOCKEY) that would have been in that SQL enum table, or should the value be a label or a property? Is there a performance impact from having, say, thousands or millions of nodes pointing to that one enum node, or is it more or less not a concern?

I understand there might be cases for each, and if so, please explain when to use which.


1 Answer

2
votes

For this kind of modeling, nodes are the best approximation, with the label being the type, and a property on each for the value.

To model your enum example you might have:

(:Activity{name:'BASKETBALL'})
(:Activity{name:'HOCKEY'})

Then you can have relationships to these nodes as appropriate:

(:Person{name:'Matt'})-[:INTERESTED_IN]->(:Activity{name:'HOCKEY'})
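The patterns above describe the shape of the data rather than how to create it. As a minimal sketch, assuming you want each enum value to exist exactly once, MERGE can create the nodes and the relationship without duplicating them (the variable names `hockey` and `matt` are only for illustration):

```cypher
// Create each enum value once; MERGE reuses the node if it already exists.
MERGE (:Activity {name: 'BASKETBALL'})
MERGE (hockey:Activity {name: 'HOCKEY'})
// Create (or reuse) the person, then link them to the shared enum node.
MERGE (matt:Person {name: 'Matt'})
MERGE (matt)-[:INTERESTED_IN]->(hockey);
```

Running this twice should leave the graph unchanged the second time, which is roughly the behavior you get from a lookup table with a unique key in SQL.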

This model works well for most kinds of queries, for example: "Give me information about Matt, including what activities he's interested in", "Is Matt interested in hockey?", and "Which people are interested in hockey?"
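As a rough sketch of how those three questions might look in Cypher against the model above (the returned column names are my own choice, not anything required by Neo4j):

```cypher
// 1. Information about Matt, including the activities he's interested in.
MATCH (p:Person {name: 'Matt'})
OPTIONAL MATCH (p)-[:INTERESTED_IN]->(a:Activity)
RETURN p, collect(a.name) AS activities;

// 2. Is Matt interested in hockey? count() is 0 when nothing matches.
MATCH (p:Person {name: 'Matt'})-[:INTERESTED_IN]->(:Activity {name: 'HOCKEY'})
RETURN count(p) > 0 AS interestedInHockey;

// 3. Which people are interested in hockey?
MATCH (p:Person)-[:INTERESTED_IN]->(:Activity {name: 'HOCKEY'})
RETURN p.name AS person;
```

The first two start from a single person and touch only a few relationships. The third starts from the enum node and has to visit every relationship attached to it.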

In a case where you may have thousands or millions of nodes connected to the enum, performance impact really depends upon the direction you're traversing. If a single person only has one (or a few) relationships to :Activity nodes, then a query from persons to activities will be cheap.

However, a query from the activity to persons may be more expensive. For example, if your hockey node has millions of connections, this kind of query could be a problem:

...
// previously matched (p:Person) to all students at a school
// per student, find who else has a common interest in an activity
MATCH (p)-[:INTERESTED_IN]->()<-[:INTERESTED_IN]-(personWithCommonInterest)
...

The first traversal in the match is cheap, since each person is interested in only a few things. The second can be far more expensive, because a great many people are interested in the same thing, and every one of those relationships is expanded for each student.
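To judge whether this matters for your data, it may help to look at how many relationships each enum node actually has. A simple sketch, assuming the same labels and relationship type as above (the `interestedPeople` alias is just illustrative):

```cypher
// Count incoming INTERESTED_IN relationships per activity, busiest first.
MATCH (a:Activity)<-[:INTERESTED_IN]-(:Person)
RETURN a.name AS activity, count(*) AS interestedPeople
ORDER BY interestedPeople DESC;
```

If the counts are small, traversing through the enum node in either direction is unlikely to be a concern. If some activities have millions of relationships, queries that pass through them from the activity side are the ones to watch.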