Neo4j labels, relationship types, and Cypher matching performance

Question

Say I have a massive graph of users and other types of nodes. Each type has a label, and some nodes may have multiple labels. Since I am defining users and their access to nodes, there is one relationship type between users and nodes: CAN_ACCESS. Between other objects there are different relationship types, but for the purpose of access control, everything involves a CAN_ACCESS relationship when we start from a user.

I never perform a match without using labels, so my intention and hope is that any performance downsides to having one heavily-used relationship type from my User nodes should be negated by matching a label. Obviously, this match could get messy:

MATCH (n:`User`)-[r1:`CAN_ACCESS`]->(n2)

But I'd never do that. I'd do this:

MATCH (n:`User`)-[r1:`CAN_ACCESS`]->(n2:`LabelX`)

My question, then, is whether the use of labels on the destination side of the match is effectively equivalent to having a dedicated relationship type between a User and any given label. In other words, does this:

MATCH (n:`User`)-[r1:`CAN_ACCESS`]->(n2:`LabelX`)

Give me the same performance as this:

MATCH (n:`User`)-[r1:`CAN_ACCESS_LABEL_X`]->(n2)

If CAN_ACCESS_LABEL_X ALWAYS goes (n:`User`)-->(n2:`LabelX`)?

Mark just wrote a great blog post about this. The performance is not totally the same, but good enough: markhneedham.com/blog/2014/09/30/… — Michael Hunger
Woah! This really is great timing. I started looking into modifying my app's code based on this and discovered a bug in Neo4j.rb in the process. Thanks! — subvertallchris

subvertallchris · Accepted Answer · 2014-10-04T14:01:30

As Michael Hunger's comment points out, Mark Needham's blog post demonstrates that performance is best when you use a dedicated relationship type instead of relying on labels on the destination node.
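
If you want to check the difference on your own data, one rough approach is to profile both variants from the question and compare the reported work. This is only a sketch: it assumes your Neo4j version accepts the PROFILE prefix on a Cypher query, and that CAN_ACCESS_LABEL_X relationships (the hypothetical type from the question) already exist alongside the CAN_ACCESS ones.

```cypher
// Variant 1: one shared relationship type, label check on the target node.
PROFILE
MATCH (n:`User`)-[r1:`CAN_ACCESS`]->(n2:`LabelX`)
RETURN count(n2);

// Variant 2: dedicated relationship type, no label check on the target node.
// CAN_ACCESS_LABEL_X is the hypothetical type named in the question.
PROFILE
MATCH (n:`User`)-[r1:`CAN_ACCESS_LABEL_X`]->(n2)
RETURN count(n2);
```

Run each query a few times so caches are warm before comparing. How large the gap is will depend on how many CAN_ACCESS relationships each User has and on what fraction of them point at LabelX nodes.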
