How can I make the query below run in seconds instead of minutes?

I'm new to graph databases. Am I right in saying that node indexing won't help speed up my query? As I understand it, indexes help to find the starting point of a traversal, not with the traversal itself.

Would relationship indexing be helpful in my case?

Query

I have 2,500 nodes of type COLUMN and 52,000 relationships between nodes.

The query below is too slow; I don't even know how slow it is. It runs for more than 5 minutes, then I get a java.net.SocketTimeoutException.

Query

MATCH path = (start:PERSON)-[r:MET_REL*2..5]->(person:PERSON) 
WHERE start.ID = '385' 
WITH path UNWIND NODES(path) AS col
WITH path, 
COLLECT(DISTINCT col.COUNTRY_ID) as distinctCountries
WHERE LENGTH(path) + 1 = SIZE(distinctCountries)
RETURN path

P.S.

Moreover, I want to use [r:MET_REL*2..25] instead of [r:MET_REL*2..5].

Sometimes queries aren't actually doing what you think they're doing, but it's hard for us to tell without a verbal description of your requirements. Can you explain the purpose of the query, what you want it to be doing, and what the output is supposed to be? - InverseFalcon
The problem is that the amount of data for the described model is very high. I created the specified number of nodes and 52,000 random relationships between them. On this model, the number of paths of length 4 is 3,847,930. For length 5 the number grows dramatically, and so does the search time. What can be said about a path length of 25? And as @InverseFalcon said, you need to explain your original problem in more detail; it may be necessary to change the data model. - stdob--
Can you share the query plan for this query? The best way is to prefix it with PROFILE. - Michael Hunger

1 Answer

Make sure you have an index or constraint on :PERSON(ID), so that the start node is found by an index lookup rather than a label scan.
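
As a rough sketch, the index or constraint could be created as follows. The exact syntax depends on your Neo4j version, and the uniqueness constraint assumes that ID is unique per PERSON, which the question does not state. Run only one of these:

```cypher
// Neo4j 3.x syntax: plain index on the lookup property
CREATE INDEX ON :PERSON(ID);

// Alternative (3.x / 4.x syntax): uniqueness constraint, which is also backed by an index.
// Assumption: every PERSON node has a distinct ID.
CREATE CONSTRAINT ON (p:PERSON) ASSERT p.ID IS UNIQUE;

// Newer releases (4.x and later) use the FOR ... ON form instead:
// CREATE INDEX FOR (p:PERSON) ON (p.ID);
```

This only speeds up finding the start node; it does not reduce the number of paths the variable-length pattern has to expand.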

Please try this:

MATCH path = (start:PERSON)-[:MET_REL*2..5]->(person:PERSON) 
WHERE start.ID = '385' 
WITH path, reduce(a=[], n in nodes(path) | case when n.COUNTRY_ID IN a then a else a + [n.COUNTRY_ID] end) as countries
WHERE LENGTH(path) + 1 = SIZE(countries)
RETURN path

If you have APOC installed, there is an apoc.coll.toSet function that you could use to deduplicate the countries instead.
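
A sketch of the APOC variant, assuming the APOC plugin is installed on the server. The list comprehension collects the COUNTRY_ID of every node on the path, and apoc.coll.toSet removes the duplicates:

```cypher
MATCH path = (start:PERSON)-[:MET_REL*2..5]->(person:PERSON)
WHERE start.ID = '385'
// Collect the country of each node on the path, then drop duplicates
WITH path, apoc.coll.toSet([n IN nodes(path) | n.COUNTRY_ID]) AS countries
// A path with k relationships has k + 1 nodes; keep it only if all countries differ
WHERE length(path) + 1 = size(countries)
RETURN path
```

Note that this only simplifies the deduplication step; the variable-length expansion still enumerates every candidate path before the filter is applied.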