Totally new to graph databases -- corrections welcome.
If I want a list of all nodes that have the "User" label, does Neo4j (or any other graph database) have to scan every node for that label, or does it automatically index nodes by label?
Without indexing, every node has to be checked to see whether any of its labels matches "User", which would perform horribly:
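    // Pseudocode in Java syntax; all_nodes stands for every node in the database
    List<Node> userNodes = new ArrayList<>();
    for (Node node : all_nodes)
    {
        for (Label label : node.labels())
        {
            // compare strings with equals(), not ==
            if (label.name().equals("User"))
            {
                userNodes.add(node);
                // no need to look at other labels for this node
                break;
            }
        }
    }
    return userNodes;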
With indexing, I imagine the system looks up some system-managed "node" that has every label name under it (a search space of dozens instead of millions) and returns the children of the matching label:
    // Pseudocode in Java syntax; labels_node is system-managed
    List<Node> userNodes = new ArrayList<>();
    for (Node labelNode : labels_node)
    {
        // compare strings with equals(), not ==
        if (labelNode.name().equals("User"))
        {
            // All children of the "User" node have the label "User"
            userNodes = labelNode.children();
            // No need to look at other labels
            break;
        }
    }
    return userNodes;
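To put the indexed case another way, here is a minimal Java sketch of the kind of label index I have in mind: a map from label name to the IDs of the nodes carrying that label. The class and method names (LabelIndexSketch, addLabel, nodesWithLabel) are made up for illustration, and I am not claiming this is how Neo4j works internally.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: not Neo4j's actual implementation or API.
class LabelIndexSketch {
    // label name -> IDs of the nodes that carry that label
    private final Map<String, Set<Long>> nodesByLabel = new HashMap<>();

    // Record that the node with this ID carries the given label.
    void addLabel(long nodeId, String label) {
        nodesByLabel.computeIfAbsent(label, k -> new LinkedHashSet<>()).add(nodeId);
    }

    // Return the IDs of all nodes with this label without visiting any other node.
    Set<Long> nodesWithLabel(String label) {
        return nodesByLabel.getOrDefault(label, Collections.emptySet());
    }
}
```

With something like this, finding the "User" nodes is a single map lookup rather than a pass over every node, so the number of non-user nodes in the graph does not affect it.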
Ultimately, I think the question comes down to this: if I have a set of "things" that I need to retrieve in full by type, should I use labels for that? Or should I instead create my own "Users" node that points to every node that is a user, and only use labels once I have narrowed things down to the subset of nodes I want?
This question seems similar, though vaguer, but it did not receive a satisfactory answer.