Are labels auto-indexed in Neo4j?

Question

Totally new to graph databases -- corrections welcome.

If I want to obtain a list of nodes labeled with the "User" label, does Neo4j (or possibly other graph databases) need to search all nodes for that label, or does it somehow auto-index nodes by label?

Without indexing (horrible performance), every node is checked to see whether any of its labels matches "User", like so:

List<Node> userNodes = new ArrayList<>();
for (Node node : all_nodes)
{
  for (Label label : node.labels())
  {
    // compare string contents; == would compare object references in Java
    if (label.name().equals("User"))
    {
      userNodes.add(node);

      // no need to look at other labels for this node
      break;
    }
  }
}
return userNodes;

With indexing, the system grabs some system-managed "node" that has all of the label names under it (search space of dozens instead of millions) and grabs its children:

  List<Node> userNodes = new ArrayList<>();
  for (Node labelNode : labels_node) // where labels_node is system-managed
  {
    // compare string contents; == would compare object references in Java
    if (labelNode.name().equals("User"))
    {
      // All children of the "User" node have the label "User"
      userNodes = labelNode.children();

      // No need to look at other labels
      break;
    }
  }
  return userNodes;

Ultimately, I think this question gets to this: if I am building a list of "things" for which I need to retrieve all of them by type of thing, should I use labels to accomplish this? Or should I instead create my own "Users" node, which points to all nodes that are users, and only use labels once I have found the subset of nodes I want?

There is a similar, though more vague, question, but it did not receive a satisfactory answer.

FrobberOfBits · Accepted Answer · 2015-01-26T19:32:01

Terminology-wise, the docs talk about "labels and schema indexes". An "index" is a thing that you attach to a property of a label, such as indexing the first_name property of all :Person nodes.

But for your question, labels behave like indexes: yes, the execution engine takes advantage of them and uses them the way you'd expect it to use an index, even though the documentation doesn't describe labels as indexes.

So, for a concrete example, suppose we had a graph of 1 million nodes, of which 5 had the label :Person. And suppose we had the following query:

MATCH (p:Person) RETURN p;

The question boils down to: how many nodes does Cypher have to consider? The answer is 5, not 1 million.
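
To make the 5-versus-1-million point concrete, here is a minimal sketch of the idea in plain Java. It is a toy in-memory model, not Neo4j's actual storage engine or API: the class LabelScanSketch and its methods (addNode, fullScan, labelScan) are made up for illustration. It only assumes that the database keeps a label-to-nodes lookup up to date as nodes are created, which is the behaviour described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; this is NOT how Neo4j stores data internally.
final class LabelScanSketch {
    // node id -> labels attached to that node
    private final List<List<String>> labelsByNode = new ArrayList<>();
    // label -> ids of the nodes carrying it, maintained on every insert
    private final Map<String, List<Integer>> nodesByLabel = new HashMap<>();
    // how many nodes the most recent query had to look at
    int nodesVisited;

    int addNode(String... labels) {
        int id = labelsByNode.size();
        labelsByNode.add(List.of(labels));
        for (String label : labels) {
            nodesByLabel.computeIfAbsent(label, k -> new ArrayList<>()).add(id);
        }
        return id;
    }

    // The "no index" approach from the question: inspect every node.
    List<Integer> fullScan(String label) {
        nodesVisited = 0;
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < labelsByNode.size(); id++) {
            nodesVisited++;
            if (labelsByNode.get(id).contains(label)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // The label lookup: only nodes that carry the label are touched.
    List<Integer> labelScan(String label) {
        List<Integer> hits = nodesByLabel.getOrDefault(label, List.of());
        nodesVisited = hits.size();
        return hits;
    }

    public static void main(String[] args) {
        LabelScanSketch graph = new LabelScanSketch();
        // 1 million nodes, of which 5 are labeled "Person"
        for (int i = 0; i < 1_000_000; i++) {
            graph.addNode(i < 5 ? "Person" : "Other");
        }
        graph.fullScan("Person");
        System.out.println("full scan visited:  " + graph.nodesVisited);
        graph.labelScan("Person");
        System.out.println("label scan visited: " + graph.nodesVisited);
    }
}
```

In this toy model both queries return the same 5 nodes, but the full scan looks at all 1,000,000 nodes while the label lookup looks at only 5. In Neo4j itself you never write this bookkeeping; matching by label in Cypher gets you the second behaviour.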

Your second code snippet is more of a Neo4j 1.9 kind of approach; nowadays I wouldn't create these artificial "index nodes", and I wouldn't loop through all possible labels. I'd just match by label and be done with it.
