
I have a data model that I want to base loosely on the article posted in this Graphgist.

I'm curious what performance I can expect from the WHERE clause when a given pair of nodes has a large number of relationships between them, each with 'from' and 'to' properties defined on the edge. When you run a MATCH query like the following against, say, 100 SELLS relationships, how does Neo4j filter the edges down to just the one(s) that satisfy the WHERE criteria?

MATCH (s:Shop{shop_id:1})-[r1:SELLS]->(p:Product)
WHERE (r1.from <= 1391558400000 AND r1.to > 1391558400000)
MATCH (p)-[r2:STATE]->(ps:ProductState)
WHERE (r2.from <= 1391558400000 AND r2.to > 1391558400000)
RETURN p.product_id AS productId,
   ps.name AS product,
   ps.price AS price
ORDER BY price DESC

I haven't found a way to index properties on an edge directly, so I'm assuming that either the query optimizer takes care of something like this or it literally traverses the array of edges and finds the one(s) that match.


1 Answer


Neo4j will simply traverse all the relationships and read the property values on each one. By default there are no indexes on relationship properties (indexing them can be achieved with the legacy indexes; check the documentation).

Concerning performance, bear in mind that Neo4j is very fast at traversing relationships. So while your query is "very expensive" in principle, Neo4j can traverse 2 to 4 million relationships per second per core, depending on your hardware configuration.

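So, to summarize: for 100 relationships it will run in a flash. The filtering is not optimized at all currently, though, so you'll see the cost if you need to run the same operation over, for example, 1 million relationships. At 2 to 4 million relationships per second, that is roughly a quarter to half a second on a single core.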
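
One way to check this on your own data is to profile the query from the question. This is only a sketch, and it assumes a Neo4j version whose Cypher supports the PROFILE prefix; the operator names and the exact output differ between versions.

```cypher
// PROFILE executes the query and reports, per operator,
// the rows produced and the db hits incurred.
// Assumption: the Cypher version in use accepts the PROFILE prefix.
PROFILE
MATCH (s:Shop {shop_id: 1})-[r1:SELLS]->(p:Product)
WHERE r1.from <= 1391558400000 AND r1.to > 1391558400000
MATCH (p)-[r2:STATE]->(ps:ProductState)
WHERE r2.from <= 1391558400000 AND r2.to > 1391558400000
RETURN p.product_id AS productId,
       ps.name AS product,
       ps.price AS price
ORDER BY price DESC
```

If the relationships are scanned as described above, the db hits for the steps that expand and filter SELLS and STATE should grow roughly in line with the number of relationships on the matched nodes, rather than with the number of rows returned.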