Application Background: I am working on an application that consists of a client side and a server side. Every 5 minutes the server side consumes data from an external API feeder, transforms it, and saves it into a Neo4j graph database. The client side fetches all stored data by calling the server side and builds a chart from the received data:
http://decisionwanted.com/decisions/2/bitcoin
More saving details: every time new data is consumed, I create new history value nodes with new relationships to the existing Value (root) node.
Issue: The server side returns all data stored so far using the following Cypher query:
MATCH (v:Value)-[rvhv:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId}
OPTIONAL MATCH (hv)-[ru:CREATED_BY]->(u:User)
WHERE {fetchCreateUsers}
RETURN ru, u, rvhv, v, hv
ORDER BY hv.createDate DESC
Since the total data volume grows with each consume operation, query performance degrades and latency increases.
Questions:
- At some point, my graph will contain more than ~100k history value nodes, which will significantly decrease query performance. Can anyone suggest a better approach to storing or retrieving this kind of data?
- Instead of sending all stored data at once, I want to limit the size of the returned data based on a time range received from the client and a step that determines how many elements to skip before the next value that should be returned.
For example: there are 1000 history value nodes stored, and I want to return only every hundredth element, starting from the 1st and ending with the 1000th.
So the result set of the query should contain nodes 1, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000.
The approach looks good to me. The only problem is how to tell this Cypher query:
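To illustrate the time-range part of the idea, here is a minimal sketch. The parameters {fromDate} and {toDate} are hypothetical names that do not exist in my current query, and the sketch assumes hv.createDate is stored in a directly comparable form (for example, epoch milliseconds).

```cypher
// Sketch only: {fromDate} and {toDate} are assumed parameters supplied by the client.
MATCH (v:Value)-[rvhv:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId}
  AND hv.createDate >= {fromDate}   // lower bound of the requested range
  AND hv.createDate <= {toDate}     // upper bound of the requested range
RETURN rvhv, v, hv
ORDER BY hv.createDate DESC
```

This only narrows the result to the requested window; it does not yet skip elements within that window.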
MATCH (v:Value)-[rvhv:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId}
OPTIONAL MATCH (hv)-[ru:CREATED_BY]->(u:User)
WHERE {fetchCreateUsers}
RETURN ru, u, rvhv, v, hv
ORDER BY hv.createDate DESC
to return only every hundredth element? Does anyone know how to do it?
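One possible direction, as a sketch rather than a confirmed solution: collect the ordered history values into a list and pick every N-th index with range(). The {step} parameter is a hypothetical name and is assumed to be a positive integer. The sketch also drops the rvhv relationship from the result to keep the collection simple.

```cypher
// Sketch only: {step} is an assumed parameter (positive integer, e.g. 100).
MATCH (v:Value)-[:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId}
// Order first so that the collected list keeps the newest-first order.
WITH v, hv
ORDER BY hv.createDate DESC
WITH v, collect(hv) AS hvs
// range(start, end, step) yields the indexes 0, step, 2*step, ...
UNWIND range(0, size(hvs) - 1, {step}) AS i
WITH v, hvs[i] AS hv
OPTIONAL MATCH (hv)-[ru:CREATED_BY]->(u:User)
WHERE {fetchCreateUsers}
RETURN ru, u, v, hv
ORDER BY hv.createDate DESC
```

With 1000 nodes and a step of 100, this picks list indexes 0, 100, ..., 900, that is, the 1st, 101st, ..., 901st elements in newest-first order. The last element is included only when the node count minus one is a multiple of the step, so it would need separate handling if it must always be returned. Note that all matching nodes are still collected on the server before sampling, so this reduces the payload size rather than the amount of data the query reads.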