0
votes

We have the following data model in a Neo4j database. Nodes with label Post, and nodes with User

The relation between Post and User is (Post)-[:POSTED_BY]->(User). It indicates that a post was created by the given user.

On Post nodes, there is a property named sort_order which gives the order by which post nodes should be retrieved. The sort_order is unique and index-backed; a unique constraint is enforced in the schema.

In a listing request, I need to get the posts for a given user ordered by sort_order, and in pages of 10 (for example)

According to Neo4j Documentation, the order operation is skipped by the planner because the nodes are already ordered based on the index of the selected property sort_order. This limits the database hits and time required to complete the query by the selected page size. This applies perfectly for such a query:

MATCH (post:Post)
where post.sort_order > 0
return post
order by post.sort_order desc
limit 10

if the hint post.sort_order > 0 is given.

Once I need to retrieve the posts by a given User, the query will fetch all Post nodes related to the selected User ordering them in memory, and returning the current page (of 10 records). After profiling the following required query, it appears to ignore the index.

MATCH (post:Post)-[:IN]->(user:User {id: 3})
where post.sort_order > 0
return post
order by post.sort_order desc
limit 10

Is there are any hidden problems in the query or the data model ?

Or, if any optimizations can be done to the query, data model, or indices.

The used neo4j version is 3.5.23

1
You can evaluate the performance with EXPLAIN/PROFILE as explained here: neo4j.com/docs/cypher-manual/current/query-tuning/… - fbiville

1 Answers

0
votes

The Cypher planner probably determined (using the server's DB statistics) that it was generally more efficient to execute your last query by looking up the User node (hopefully you have an :User(id) index to help with that), getting its related Post nodes, and then sorting.

If you were successful at forcing the planner to use your :Post(sort_order) index instead, then the execution would probably be much less efficient. That is because the query would still have to get and evaluate the related User nodes for (presumably almost) every Post.