0
votes

When using LIMIT with ORDER BY, every node with the selected label still gets scanned (even with index).

For example, let's say I have the following:

MERGE (:Test {name:'b'})
MERGE (:Test {name:'c'})
MERGE (:Test {name:'a'}) 
MERGE (:Test {name:'d'})

Running the following gets us :Test {name: 'a'}, however using PROFILE we can see the entire list get scanned, which obviously will not scale well.

MATCH (n:Node)
RETURN n
ORDER BY n.name
LIMIT 1

I have a few sorting options available for this label. the order of nodes within these sorts should not change often, however, I can't cache these lists because each list is personalized for a user, i.e. a user may have hidden :Test {name:'b'}

Is there a golden rule for something like this? Would creating pointers from node to node for each sort option be a good option here? Something like

(n {name:'a'})-[:ABC_NEXT]->(n {name:'b'})-[:ABC_NEXT]->(n {name:'c'})-...

Would I be able to have multiple sort pointers? Would that be overkill?

Ref:

  1. https://neo4j.com/blog/moving-relationships-neo4j/
  2. http://www.markhneedham.com/blog/2014/04/19/neo4j-cypher-creating-relationships-between-a-collection-of-nodes-invalid-input/
1

1 Answers

0
votes

Here's what I ended up doing for anyone interested:

// connect nodes

MATCH (n:Test)
WITH n
ORDER BY n.name
WITH COLLECT(n) AS nodes
FOREACH(i in RANGE(0, length(nodes)-2) | 
  FOREACH(node1 in [nodes[i]] | 
    FOREACH(node2 in [nodes[i+1]] | 
      CREATE UNIQUE (node1)-[:IN_ORDER_NAME]->(node2))))

// create list, point first item to list

CREATE (l:List { name: 'name' })
WITH l
MATCH (n:Test) WHERE NOT (m)<-[:IN_ORDER_NAME]-()
MERGE (l)-[:IN_ORDER_NAME]->(n)

// getting 10 nodes sorted alphabetically

MATCH (:List { name: 'name' })-[:IN_ORDER_NAME*]->(n)
RETURN n
LIMIT 10