I need advice on improving the performance of a social graph query. The target query works fine when the result set is small, but it may return more than 1,000 rows. Can performance be tuned for Cypher queries with large responses?
The Cypher query being used:
START givenFriend=node:Nodes('id:709387498'),
item=node:ItemCat1Cat2('category:a.b')
MATCH p = givenFriend-[:FRIEND]-friend1-[:FRIEND]-friend2-[:DATA]->item
RETURN p, item
Neo4j core 1.9.5
The graph contains connected friends:
friend1Node-[:FRIEND]->friend2Node
A friend can have several data items which are represented as nodes with properties:
friendNode-[:DATA]->DataNode
A data node has about 8 properties. Among them is a category property. The data item nodes are indexed by category.
Number of friend nodes: 650,772
Number of friend relationships: 842,755
Number of data item nodes: 5,640
The query that needs improvement should select all paths from a node with a given id, through 2 friends, to data items with a given category. The paths look like this:
givenFriend-friend1-friend2-dataItem
Can a traversal improve the performance?
Could migrating to 2.0.0 improve the DB model and query performance?
**UPD**
- I use the PHP library https://github.com/jadell/neo4jphp, but I'm open to other options. Right now I'm looking at neoism (Golang). I have also considered using a Neo4j extension to run the query. The target query was also tested through the Neo4j dashboard, so no client layer was involved.
- The latest version of the PHP library uses X-Stream; mine does not. But since the query was tested without a client, this factor can be ruled out.
- The question was a good one. I've tuned the query so that it returns only the properties I need rather than whole nodes, and performance improved a bit.
- If I understand you correctly about the SLA: requests of this type should work at a concurrency of 100 with an allowable response time of 2 s per request. Query response times through the dashboard:
LIMIT 1 = 195ms
LIMIT 100 = 564ms
LIMIT 1000 = 1549ms
LIMIT 3000 = 3208ms
SKIP 7000 LIMIT 1 = 2051ms
The response can contain up to 13K records.
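To illustrate the tuning described above, a property-only, paged variant of the query might look like the sketch below. This is not the exact tuned query: `friend2.id` is an assumed property name (inferred only from the index key), `item.category` refers to the category property mentioned earlier, and the offset and page size are illustrative.

```cypher
// Sketch only: friend2.id is an assumed property name;
// item.category is the indexed category property of the data node.
START givenFriend=node:Nodes('id:709387498'),
      item=node:ItemCat1Cat2('category:a.b')
MATCH givenFriend-[:FRIEND]-friend1-[:FRIEND]-friend2-[:DATA]->item
// Return scalar properties instead of the full path p and the item node
RETURN friend2.id, item.category
// Fetch one page at a time; offset and page size are illustrative
SKIP 0 LIMIT 1000
```

Note that paging alone does not remove the cost of the skipped rows: in the timings above, SKIP 7000 LIMIT 1 still took 2051ms. Without an ORDER BY clause, the order of rows across pages is also not guaranteed.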