0
votes

Almost every element in my Neo4j database has a property, but the name of that property may differ between labels. I wrote the following query:

MATCH (n)
MATCH ()-[e]-()
UNWIND [n.start_t, e.start_t, n.end_t, e.end_t, n.begin, e.begin, n.end] AS l
RETURN min(l) AS min, max(l) AS max

This runs correctly, but although I have fewer than 1000 elements, it takes about 7 seconds. How can I make this query more efficient?

With RDBMS databases, this type of query runs a lot faster.

2
For each node/relationship, will only one of those properties be present? Or could several be there on a single node/relationship? - InverseFalcon
One or two will be present for each. - canbax

2 Answers

1
votes

Your query performs a Cartesian product between (n) and ()-[e]-(), which is likely why it is slow.
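One way to check this on your own data is to prefix the original query with PROFILE, which runs it and reports the execution plan with row counts per operator. A CartesianProduct operator with a large row count in the plan would confirm the diagnosis. This is only a diagnostic sketch of the query from the question, not a fix:

```cypher
// PROFILE executes the query and shows the plan, including rows per operator.
// Look for a CartesianProduct operator between the two MATCH clauses.
PROFILE
MATCH (n)
MATCH ()-[e]-()
UNWIND [n.start_t, e.start_t, n.end_t, e.end_t, n.begin, e.begin, n.end] AS l
RETURN min(l) AS min, max(l) AS max
```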

Do the following instead:

MATCH (n)-[e]->()
UNWIND [n.start_t, e.start_t, n.end_t, e.end_t, n.begin, e.begin, n.end] AS l
RETURN min(l) AS min, max(l) AS max

Edit: If you have nodes without relationships, use OPTIONAL MATCH:

MATCH (n)
OPTIONAL MATCH (n)-[e]->()
UNWIND [n.start_t, e.start_t, n.end_t, e.end_t, n.begin, e.begin, n.end] AS l
RETURN min(l) AS min, max(l) AS max
0
votes

As Frant pointed out, the cross product here is killing your performance. Note that this isn't just a cross product between all nodes and all relationships: it is against twice the number of relationships, because without a direction in the pattern each relationship is matched twice, once in each direction.
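To put a number on that, a rough sketch like the following counts the rows the two MATCH clauses of the original query produce before the UNWIND. The aliases nodeCount, relMatches and crossProductRows are just illustrative names, and self-loops are left out of consideration:

```cypher
// Count nodes first, collapsing to a single row.
MATCH (n)
WITH count(n) AS nodeCount
// Undirected pattern: each relationship is matched once per direction.
MATCH ()-[e]-()
WITH nodeCount, count(e) AS relMatches
// Rows the original query builds before UNWIND is applied.
RETURN nodeCount, relMatches, nodeCount * relMatches AS crossProductRows
```

The UNWIND over a seven-element list then turns each of those rows into seven more, so even a graph with a few hundred nodes and relationships can end up aggregating over a very large number of rows.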

I think you would be better served by calculating the min and max for all nodes first, then the min and max for all relationships, and finally combining the two pairs of results (using APOC to get the min and max of a list):

MATCH (n)
UNWIND [n.start_t, n.end_t, n.begin, n.end] as l
WITH min(l) as min, max(l) as max
MATCH ()-[e]->()
UNWIND [e.start_t, e.end_t, e.begin, e.end] as l
WITH min, max, min(l) as minR, max(l) as maxR
RETURN apoc.coll.min([min, minR]) as min, apoc.coll.max([max, maxR]) as max

At the point where you have min and max from the nodes, you're back down to a single row and ready to tackle the rest without cardinality issues.
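If you are on Neo4j 4.0 or later (an assumption, since the question does not state a version), another way to avoid the cross product is to gather the node values and the relationship values in a CALL subquery with UNION ALL and aggregate once over the combined rows. This is a sketch of that alternative, not something tested against your data:

```cypher
// Assumes Neo4j 4.0+, where CALL { ... } subqueries allow post-UNION processing.
CALL {
  // Candidate values from nodes.
  MATCH (n)
  UNWIND [n.start_t, n.end_t, n.begin, n.end] AS l
  RETURN l
  UNION ALL
  // Candidate values from relationships, matched once each (directed pattern).
  MATCH ()-[e]->()
  UNWIND [e.start_t, e.end_t, e.begin, e.end] AS l
  RETURN l
}
// min() and max() skip null values, so missing properties are ignored.
RETURN min(l) AS min, max(l) AS max
```

Here the row count grows with the sum of nodes and relationships rather than their product, and the query still returns a row when the graph has no relationships at all.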

EDIT

The APOC functions here are the cleanest way to get min or max within a single row.

Otherwise we may have to do something like this:

MATCH (n)
UNWIND [n.start_t, n.end_t, n.begin, n.end] as l
WITH min(l) as min, max(l) as max
MATCH ()-[e]->()
UNWIND [e.start_t, e.end_t, e.begin, e.end] as l
WITH min, max, min(l) as minR, max(l) as maxR
RETURN CASE WHEN min < minR THEN min ELSE minR END AS min,
       CASE WHEN max > maxR THEN max ELSE maxR END AS max
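One caveat with the CASE version: if relationships exist but none of them carries any of these properties, minR and maxR are null, the comparisons evaluate to null, and the ELSE branch returns null even though the node side has values. A possible APOC-free variant that sidesteps this is to unwind the four partial results and aggregate again, since min() and max() ignore nulls. The aliases nMin, nMax, eMin, eMax and v are illustrative names only:

```cypher
MATCH (n)
UNWIND [n.start_t, n.end_t, n.begin, n.end] AS l
WITH min(l) AS nMin, max(l) AS nMax
MATCH ()-[e]->()
UNWIND [e.start_t, e.end_t, e.begin, e.end] AS l
WITH nMin, nMax, min(l) AS eMin, max(l) AS eMax
// Back to one row: expand the four partial results and aggregate once more.
// Null entries are skipped by min() and max().
UNWIND [nMin, nMax, eMin, eMax] AS v
RETURN min(v) AS min, max(v) AS max
```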