I have 2 nodes. The first, "b1", has 16m relationships and the second, "b", has 17k. Label B is indexed on the id property.
My query to check whether they have a direct relationship is:
profile
MATCH (b:B {id :'D006019' }) WITH b
MATCH (b1:B {id :'D006801' }) WITH b, b1
MATCH (b)-[r]-(b1) RETURN r
Several observations:
The query is extremely slow: it runs for about 5 minutes. It starts with a nodeindexscan, which is very fast, but then it somehow picks node "b1" and continues by expanding from that node. But "b1" has 16m relationships, and expanding all of them and then filtering ruins the performance.
I can make this query fast enough if I change it a little.
Here is the much faster query:
profile
MATCH (b:B {id :'D006019' }) WITH b
MATCH (b1:B) WHERE b1.id IN ['D006801' ] WITH b, b1
MATCH (b)-[r]-(b1) RETURN r
Now "b1" is matched through an "IN" clause, and neo4j starts expanding from "b", which has only 17k relationships, so the query executes in around 100 ms.
My question is: can the query be written in a way that makes neo4j automatically expand from the less connected node?
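For reference, the relationship counts quoted above can be checked with something like the query below. This is only a sketch: it assumes a Neo4j version where size() over a pattern expression is still supported (newer versions use a COUNT { ... } subquery instead), and the aliases degreeB and degreeB1 are just names picked for illustration.

```cypher
// Look up both nodes through the index on :B(id)
MATCH (b:B {id: 'D006019'})
MATCH (b1:B {id: 'D006801'})
// Count the relationships attached to each node, in any direction
RETURN size((b)--())  AS degreeB,
       size((b1)--()) AS degreeB1
```

Whichever node reports the smaller count is the cheaper one to expand from.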
Isn't MATCH (b1:B {id:'D006801'})-[r]-(b:B {id :'D006019' }) RETURN r enough (and faster)? - Bruno Peres